Thursday, 31 July 2014

nt.number theory - Why is "h" the notation for class numbers?

F. Cajori gives several pointers in his A history of mathematical notations, Vol. 2, page 40. I think (he's a bit unclear...) he attributes the notation to Kronecker, referring to Dickson's History, Vol. 3, page 93. Dickson, in turn, in page 138 of that volume, tells us that Kronecker uses that notation in [Sitzungsberichte Akad. d. Wissensch. (Berlin, 1885), Vol. II, p. 768-80]



He apparently had introduced numbers $F(d)$, $G(d)$, $E(d)$, and when he needed one more, he used $H$ :P



(Reading on, we find the first appearence of a lowercase $h$ in Dickson referring to a paper of Weber (Göttingen Nachr., 1893, 138--147, 263--4), so---since Dickson uses notation from the papers he is quoting, we can blame Weber for the change of case)

gr.group theory - Relations between two particular elements of SL_2(Z)?

Let $S_4 = left(begin{array}{cc}0&-1 \ 1&0 end{array}right) textrm{ and } S_6 = left(begin{array}{cc} 1&-1 \ 1&0end{array}right)$. Serre proves in his book on trees that $SL_2(mathbb{Z}) cong mathbb{Z}/4 *_{mathbb{Z}/2} mathbb{Z}/6$, and $S_4$ and $S_6$ are the elements corresponding to the generators of $mathbb Z/4$ and $mathbb Z/6$ (I'm not sure if this is related to my question). Then let $a = S_4 S_6$ and $b = S_4 S_6^2$. I believe every element of $SL_2(mathbb Z)$ can be written as $S_6^d w S_6^e$, where $w$ is a word in $a$ and $b$ but not $a^{-1}$ or $b^{-1}$.



I wrote a program (for other purposes) that seems to show that there aren't any relations between $a$ and $b$ that have length 15 or less and don't involve $a^{-1}$ or $b^{-1}$. I'm not certain that the program is right, but if it is, one might make a naive guess that these two elements generate a free group. This makes me suspicious.



1) Does $SL_2(mathbb Z)$ contain a free group (of rank > 1)? If it does, is there an easy way to determine whether the subgroup generated by $a$ and $b$ is free?



2) A slightly less naive guess is that $a$ and $b$ generate a free monoid in $SL_2(mathbb Z)$. Is there a reason why $SL_2(mathbb Z)$ can't contain a free monoid, or an example showing that it does?



EDIT: Thanks for the quick replies. As Robin and Jack pointed out, $a$ and $b$ generate SL(2,Z), so clearly don't generate a free group. Also, there are free subgroups that are easy to write down. I'm still curious about #2, though.

Sunday, 27 July 2014

co.combinatorics - Some extremal problem for uniform hypergraph with fixed number of edges.

Let S = {1, .. ,n}.
Let H = (S, E) be the m-uniform hypergraph with r edges.
Let F(H, k) = #{B | |B| = k, $exists R in E, B cap R = emptyset$ } -
a number of k-subsets, that doesn't intersect with edges of H.



I am interested in following two problems:




Let r, m, k be fixed parameters.



  1. For which hypergraph H, F(H, k) be minimal?

  2. For which H, F(H, k) be maximal?



I have a conjecture for the first problem, but can't prove it.



Introduce a total order $preceq$ on m-element subsets of S: $a preceq b$ iff $exists s$, such that $s in a, s notin b$ and $forall i < s, i in a cap b mbox{ or } i notin a cup b$.
Example: 3-subsets from 5 element set in ascending order according to $preceq$.
11100
11010
11001
10110
10101
10011
01110
01101
01011
01111




Conjecture For all k minimal value of F(H, k) achieved on hypergraph, whose edges are r first sets from ${S choose m}$ according to $preceq$ order.


mathematical finance - Matching Dynamic Trading Strategies with Derivatives

I understand your problem is:



(i) You have a time-dependent p&l(I call it pay-off) from trading a portfolio of financial assets (although this doesn't really matter)



(ii) You want to find a static portfolio of assets, replicating this p&l in every state of the world (called the replicating portfolio).



This can be done very easy if:



(a) you work in discrete time, i.e. the pay-off you are trying to match is a finite vector of random variables $G_1$, ..., $G_T$



(b) the pay-off does not depend on the path of the underlying assets you are trading in.



Under these conditions it suffices to find a replicating portfolio at a single time $t$ of pay-off. The pay-off is a real-valued function $f=f(S_t)$ within some function-space and what you really want to do is to approximate $f$ by a set of basis function (i.e. the options).



Of course there are many ways to approximate $f$ as well as many possible choices for basis functions. The best approach will depend on your specific set-up. You have to decide
(A) What function space you are working in: Is piecewise linear OK or will your trading strategy produce discontinuities?
(B) What are your basis instruments? If you are looking for a practical solution you will be restricted to traded instruments, i.e. no far out-of-the money stuff, no digital options and so on.
(C) What is your notion of error? $L^2$ is natural but maybe you have a specific utility or risk aversion?
(D) What is your method of discretizing domain and range of $f$?



Anyway if there are not too many times and underlying assets involved, this might be something simple enough to do in Excel.



On the other hand if condition (b) above is not fulfilled, you are most likely in for some trouble. Path dependency creates lots of problems ("curse of dimension") and from a practical perspective you will have trouble finding sufficiently many path-dependent instruments for replication. The method of choice in that case is a regression approach. I.e. regress the pay-off from your strategy against various instruments you deem appropriate.

Saturday, 26 July 2014

lo.logic - Is every model of ZF countable "seen from the outside"?

Michael Greinecker's answer is correct. But there is another subtler weaker sense in which what you asked for could be true.



Namely, the method of forcing shows that every set that exists in any model of set theory can become countable in another larger model of set theory, the forcing extension obtained by collapsing the cardinality of that set to ω. Thus, the concept of countability loses its absolute meaning; whether a set is countable or not depends on the set theoretic background.



So if X is any set, then in some forcing extension of the universe, the set X becomes countable.



In particular, this will be true when X is itself a model of set theory.



There are various ways of viewing the nature of existence of these forcing extensions, among them Boolean-valued models and their quotients, the Boolean ultrapowers, and I refer you to the set-theoretic literature. If one uses the Boolean ultrapower, then stating the theorem from inside V, one attractive way to describe the situation is as follows:



Theorem. If V is the universe of sets and X is any particular set, then there is a class model (W,E) of set theory, and an elementary embedding of V into a submodel W0 of W, such that the image of X in W is thought by W to be countable.



Basically, the model W0 is the Boolean ultrapower of V by the forcing B to collapse X, using any ultrafilter on B, and W is the quotient of the Boolean valued model VB. The elementary embedding maps every object y to the equivalence class of the check name of y.

ag.algebraic geometry - on chern classes and Riemann Roch theorem for torsion-free sheaves on singular (possibly multiple) curve

  1. I'm looking for a definition of Chern class (at least the first one) for a torsion-free sheaf $F$ (not necessarily locally free) on a singular curve (for simplicity can assume all the singularities are planar).

The Chern class can be, of course, extracted from an exact sequence relating $F$ to some locally free sheaves. But I would like some more direct definition, like the one given by Hartshorne (Generalized divisors on Gorenstein curves and a theorem of Noether. J. Math. Kyoto Univ. 26 (1986), no. 3, 375--386).



At least for the first Chern class.



1.5 Even if one wants to define $c_1(F)$ from some resolution: torsion free sheaves on singular curves sometimes have no finite locally free resolutions. What would you do in this case?



  1. I'm looking for the Riemann-Roch for torsion-free sheaves on a singular curve (can assume the singularities to be planar). For example Hartshorne in the paper above does it for rank one.

Of course, if the only definition of the first Chern class is from the exact sequence, then Riemann-Roch is tautological (an alternative way to define $c_1(F)$). So this question is meaningful modulo the first question.



Somehow I do not find all this in classical textbooks.



Thanks to everybody!!!!

ca.analysis and odes - Atiyah-Singer index theorem

You need to understand pseudodifferential operators if you want to understand the original statement of the full Atiyah-Singer index theorem. However, in most applications to differential geometry, only the theorem for twisted Dirac operators is needed. (One of the main results of Atiyah and Singer is that the Bott periodicity theorem - or rather, its generalization to vector bundles, the Thom isomorphism theorem for K-theory - reduces the general case to that of twisted Dirac operators.)



If you want to learn the theory of pseudodifferential operators, I recommend the original papers of Kohn and Nirenberg and Hörmander. This theory is not needed to prove the Atiyah-Singer index theorem: you can get away with the existence of an asymptotic solution of the heat equation. To see this in action, see the paper of McKean and Singer.



One advantage of the heat-kernel approach is that it is well-adapted to study the generalizations of the theory, such as the theory of analytic torsion and the family index theorem.

Algorithm generalizing continued fractions for non-quadratic algebraic numbers

One generalization is to the theory of sails. If $A$ is an $ntimes n$ integer matrix whose eigenvalues are all real, positive, irrational and distinct, a collection of $n$ suitable eigenvectors spans a polyhedral cone which is invariant under $A$. The convex hull of the set of integer lattice points in this cone is a polyhedron, and the vertices of this polyhedron are the ``best'' integral approximations to the eigenvectors. Also see Arnold, e.g.



MR1704965 (2000h:11012)
Arnold, V. I.(RS-AOS)
Higher-dimensional continued fractions. (English, Russian summary)
J. Moser at 70 (Russian).
Regul. Chaotic Dyn. 3 (1998), no. 3, 10--17.

ac.commutative algebra - Differences between reflexives and projectives modules

Well, the answer is well known of course. For a finitely generated module over a commutative normal Noetherian domain TFAE



  1. M is reflexive

  2. M is torsion-free and equals the intersection of its localizations at the codimension 1 primes

  3. M satisfies Serre's condition S2 and its support is Spec R.

  4. M is the dual of a f.g. module N

As you say, a finite projective module is the same as a locally free sheaf on Spec R. Similarly, a finite reflexive module is the same as the push forward of a locally free sheaf from an open subset U of Spec R whose complement has codimension $ge2$.



So for an easy example take a Weil divisor D which is not Cartier, the associated divisorial sheaf (corresponding to an R-module of rank 1) is reflexive, not projective. Your example with a line on a quadratic cone is of this form.



This stuff is standard and used all the time in higher-dimensional algebraic geometry around the Minimal Model Program. For an old reference covering some of this, see e.g. Bourbaki, Chap.7 Algebre commutative, VII.

Wednesday, 23 July 2014

Proof of Bloch-Kato conjecture of K-theory?

Here are some lectures by Charles Weibel. Early on, they discuss Milnor Conjecture and Bloch-Kato, and they should go through the proof. My understanding is that there were a bunch of people involved in the proof, though a few were a bit reticent to actually write up their parts of it, and so Weibel drew the short straw and is the one writing it up.



EDIT: Adding a bit, here's Weibel's 2006 page where he notes the status as of then, and to make sure that this is roughly self-contained, here's the statement:




For an odd prime $ell$, and a field $k$ containing $1/ell$, the Milnor K-theory $K^M_n(k)/ell$ is isomorphic to the étale cohomology $H^n_{ét}(k,μ_ell^n)$
of the field $k$ with coefficients in the twists of $μ_ell$.


Tuesday, 22 July 2014

co.combinatorics - Balancing problem

There was a problem in an Olympiad selection test, which went as follows: Consider the set ${1,2,dots,3n }$ and partition it into three sets A, B and C of size n each. Then, show that there exist x, y and z, one in each of the three sets, such that x + y = z.



This has a tricky-to-get but otherwise straightforward solution, that starts by assuming 1 to be in A, finding the smallest k not in A, assuming that to be in B, and then arguing that no two consecutive elements can be present in C (for that would give an infinite descent). Finally, cardinality considerations solve the problem.



I managed to prove a corresponding statement for 4n, namely: for ${ 1,2,3, dots, 4n }$, partitioned into four sets of size n each, there exist x, y, z, and w, one in each set, such that x + y = z + w.



The question here is whether analogues of this hold for all m, with $m ge 3$ and $n ge 2$. In other words, if ${ 1,2, dots, mn }$ is divided into $m$ sets of size $n$ each, can we always make a choice of one element in each set such that the sum of floor $m/2$ of the elements equals the sum of the remaining ceiling $m/2$ elements ($(m-1)/2$ and $(m + 1)/2$ for $m$ odd, $m/2$ each for $m$ even). Note we need $n ge 2$ due to parity considerations when $m$ is congruent to $1$ or $2$ modulo $4$.

lo.logic - definition of the set of natural numbers

Since the set N (or ω) of natural numbers is the minimal inductive set, a natural definition of N would be that it is the intersection of all inductive sets:



N = { x : (∀I)(I is inductive → x ∈ I) }



However, this does not fit the comprehension scheme which only allows the formation of sets of the form {x ∈ A: φ(x)} for some set A and formula φ(x). The trick is to pick an arbitrary inductive set I0 and then define



N = { x ∈ I0 : (∀I)(I is inductive → x ∈ I) }.



Note that this is equivalent to the above definition since every set x that belongs to every inductive set necessarily belongs to the particular inductive set I0, no matter what I0 we picked.



This situation is not unique to the set of natural numbers. The same problem occurs when trying to define the empty set via comprehension. A natural definition is



∅ = { x : x ≠ x },



but this is not admissible since x is not restricted to a set. Again, we can pick an arbitrary set A and define



∅ = { x ∈ A : x ≠ x }.



This is of course independent of our choice of A since ∅ is contained in every set.



Why are such arbitrary choices necessary? This is because the language of set theory is a purely relational language: there are no constants and functions, only variable symbols together the relations = and ∈. Therefore there is no closed expression for any set whatsoever!!!



The sets ∅ and ω are merely informal notations for specific sets. Even the set-builder notation is informal. Remember that the notation



{ x ∈ A : φ(x) }



is not part of the language of set theory, it is merely a convenient notation for the unique set B such that



(∀x)(x ∈ B ↔ x ∈ A ∧ φ(x)).



As explained in this answer by Joel Hamkins, it is perfectly reasonable to use set-builder notation, or the constants ∅ and ω within set theory. It can be shown that adding formal function or constant symbols for these is a conservative extension of the axioms ZF/ZFC for set theory. Indeed, every model of ZF has a unique interpretation for these symbols. In practice, set theorists freely use such symbols and notations without worry, since they could always expand the language to include these if they were so inclined.



In the end, the decision by Hrbacek & Jech to define N using the informal notation



N = { x ∈ I0 : (∀I)(I is inductive → x ∈ I) }



is that this is correct and it also implicitly justifies the existence of the set N, whereas the even more informal



N = { x : (∀I)(I is inductive → x ∈ I) }



is also correct but it does not justify the existence of N since it does not fit the comprehension scheme.

Saturday, 19 July 2014

ag.algebraic geometry - Is being torsion a local property of module elements?

No, being torsion is not a local property, and I can give a counterexample. [Edit: This took some doing, with my initial answer containing a serious flaw. After completely reworking the construction, this should work now. Apologies for the length of this answer, but I don't see any quick constructions].



The idea is to construct a ring $R$ and an ideal $I$ contained in the zero divisors of $R$, and $f_1,f_2in R$ satisfying $f_1+f_2=1$ such that $I_{f_i}$ contains a regular element of $R_{f_i}$ for each $i$.
This provides a counterexample to the question by taking the module $M=R/I$ and $m=I+1$. Then, ${rm ann}(m)=I$ consists of zero divisors, so $m$ is not torsion. However, mapping $m$ into $M_{f_i}$ takes ${rm ann}(m)$ to $I_{f_i}$, which contains regular elements of $R_{f_i}$. So, $m$ is torsion in each $M_{f_i}$.



This does get rather involved, so let's start simple and construct an example showing that being torsion is not a stalk-local property.



Choose a field $k$, set $A=k[X_0,X_1,X_1,X_2,ldots]$ and let $J$ be the ideal generated by $X_iX_j$ for $inot=j$ and $X_i(X_i-1)$ for $ige1$. Then, $R=A/J$ is the $k$-algebra generated by elements $x_0,x_1,ldots$ satisfying the relations $x_ix_j=0$ for $inot=j$ and $x_i(x_i-1)=0$ for $ige1$. Let $Isubseteq R$ be the ideal generated by $x_0,x_1,ldots$.
We can see that $x_inot=0$ by considering the $k$-morphism $Ato k$ taking $X_j$ to 1 (some fixed $j$) and $X_i$ to 0 for $inot=j$. This takes $J$ to 0, so it defines a $k$-morphism $Rto k$ mapping $x_j$ to 1, so $x_jnot=0$. Then, every $ain I$ satisfies $ax_j=0$ for large $j$, showing that it is a zero divisor. Also, the $k$-morphism $Ato k$ taking each $X_i$ to zero contains $J$ in its kernel, and defines a morphism $Rtomathcal{k}$ with kernel $I$, showing that $R/Icong k$. So $I$ is a maximal ideal. For any prime $mathfrak{p}$ we either have $mathfrak{p}not=I$, in which case the non-empty set $Isetminusmathfrak{p}$ maps to units (and hence, regular elements) in $R_{mathfrak{p}}$. Or, we have $mathfrak{p}=I$ in which case $x_i-1$ maps to a unit and $x_i$ goes to zero in $R_{mathfrak{p}}$ ($ige1$). So, $R_{mathfrak{p}}cong k[X]$ with $x_0$ going to the regular element $X$. This shows that $I$ contains regular elements in the localization at any prime, giving the required counterexample for the stalk-local case.



Now, let's move on to the full construction of the counterexample showing that being torsion is not a local property. Simply guessing a set of generators and relations as for the stalk-local case didn't work out so well. Instead, I will start with a simple example of a polynomial ring and then transform it in such a way as to give the properties we are looking for. I find it helpful to first fix the following notation: Start with the base (polynomial) ring $R=mathbb{Z}[x,y,z]$. A (commutative, unitial) R-algebra is simply a ring with three distinguished elements $x,y,z$, and a morphism of R-algebras is just a ring homomorphism respecting these distinguished elements. For an R-algebra $A$, define $K(A)subseteq A$ to be the smallest ideal such that, for all $ain A$,
$$
begin{align}
axin K(A)&Rightarrow azin K(A),\\
ayin K(A)&Rightarrow a(1-z)in K(A).
end{align}
$$
In particular, $K(A)=0$ implies that $x$ is a regular element in the localization $A_z$ and $y$ is a regular element in $A_{1-z}$. If we can construct such an example where the ideal $Ax+Ay$ consists purely of zero divisors, then that will give the counterexample needed. The idea is to start with $A=mathbb{Z}[x,y,z]$ and transform it using the following steps.



  • Force the elements of $I=Ax+Ay$ to be zero divisors. So, for each $ain I$, add an element $b$ to $A$ in as free a way as possible such that $ab=0$. Adding elements to $A$ also has the effect of adding elements to $I$. So, this step needs to be iterated to force these new elements of $I$ to also be zero divisors.

  • Replace $A$ by the quotient $A/K(A)$ to force the condition $K(A)=0$.

The first step above is easy enough. However, we do need to be careful to check that the second step does not undo the first. Suppose that $ain A$ is a zero divisor, so that $ab=0$ for some non-zero $b$. It is possible that taking the quotient in the second step above takes $b$ to zero, so that $a$ becomes a regular element again. To get around this, we need some stronger condition on $b$ which implies $bnot=0$ and is also stable under taking the quotient. Note that $A(1-b)$ being a proper ideal or, equivalently, $A/(1-b)$ being nontrivial, will imply that $bnot=0$. In turn, this is implied by $K(A/(1-b))$ being a proper ideal. As it turns out, this property of $b$ does remain stable under each of the steps above, and can be used to show that this construction does give the counterexample required. However, note that if $ab=0$ and $K(A/(1-b))$ is proper, then $a=a(1-b)in A(1-b)$, from which we can deduce that $K(A/(a))$ is a proper ideal. This necessary condition is unchanged by either of the steps above, so we had better check that elements $ain Ax+Ay$ in our R-algebra do satisfy this from the outset. I'll make the following definition: $A$ satisfies property (P) if $K(A/(a))$ is proper for every $ain Ax+Ay$. As it turns out, polynomial rings do satisfy this property and, consequently, the construction outlined above works fine.



Now on to the details of the argument.




(1) Let $fcolon Ato B$ be an R-morphism. Then $f(K(A))subseteq K(B)$. Furthermore,



  • If $Isubseteq A$, $Jsubseteq B$ are ideals with $f(I)subseteq J$ and $K(B/J)$ is proper, then $K(A/I)$ is proper.

  • If $B$ satisfies (P) then so does $A$.



As $f^{-1}(K(B))$ satisfies the defining properties for $K(A)$ (other than minimality), it contains $K(A)$. In particular, if $K(B)$ is proper then $K(A)subseteq f^{-1}(K(B))$ is proper. Next, if $f(I)subseteq J$ are ideals, then $f$ induces an R-morphism $A/Ito B/J$ so, if $K(B/J)$ is proper then so is $K(A/I)$.



If $B$ satisfies property (P) and $ain Ax+Ay$ then $f(a)in Bx+By$ and $K(B/f(a))$ is proper. So, $K(A/(a))$ is proper and $A$ also satisfies property (P).




(2) If $A$ is a non-trivial ring then the polynomial ring $Rotimes Acong A[x,y,z]$ satisfies (P).




As $A$ is non-trivial, it must have a maximal ideal $mathfrak{m}$. Applying (1) to the R-morphism $A[x,y,z]to(A/mathfrak{m})[x,y,z]$ reduces to the case where $A=k$ is a field. Then, letting $bar k$ be the algebraic closure, applying (1) to $k[x,y,z]tobar k[x,y,z]$ reduces to the case where $A=k$ is an algebraically closed field.



Now set $B=k[x,y,z]$ and choose $ain Bx+By$. The idea is to look at the morphism $thetacolon B/(a)to k$ taking $x,y,z$ to some $x_0,y_0,z_0in k$ with $a(x_0,y_0,z_0)=0$. As long as these satisfy $ux_0=0Rightarrow uz_0=0$ and $uy_0=0Rightarrow u(1-z_0)=0$ (all $uin k$) then $K(B/(a))$ will be contained in the kernel of $theta$, so will be proper. For this to be the case it is enough that both ($x_0not=0$ or $z_0=0$) and ($y_0not=0$ or $z_0=1$).



Case 1: We can find $a(x_0,y_0,z_0)=0$ such that $x_0y_0not=0$. This satisfies the required condition.



Case 2: Whenever $a(x_0,y_0,z_0)=0$ then $x_0y_0=0$. This means that $xy$ is contained in the radical ideal generated by $a$, so $a$ divides $x^ry^r$ some $rge1$. Then $a$ is a multiple of $x$ or $y$ and one of $(x_0,y_0,z_0)=(0,1,0)$ or $(1,0,1)$ satisfies the required condition. So, $K(B/(a))$ is proper.



Next, we construct extensions of the R-algebra forcing elements of $Ax+Ay$ to be zero-divisors.




(3) If $A$ satisfies (P), then we can construct an R-morphism $fcolon Ato B$ with a left-inverse and such that, for every $ain Ax+Ay$, there is a $bin B$ with $ab=0$ and $K(B/(1-b))$ is proper.




To construct the morphism, set $I=Ax+Ay$ and let $(X_a)_{ain I}$ be indeterminates over $A$. Let $J$ be the ideal in $A[(X_a)_{ain I}]$ generated by $(aX_a)_{ain I}$. Then define $B=R[(X_a)_{ain I}]/J$ and let $f$ be the canonical homomorphism. Its left inverse is the map taking $X_a$ to 0.



Now, for a fixed $ain I$, set $b=J+X_a$, so $ab=0$. Consider the morphism $A[(X_c)_{cin I}]to Ato A/(a)$ taking each $X_c$ to 0 (for $cnot=a$) and $X_a$ to 1. As its kernel contains $J$, it defines a morphism $gcolon Bto A/(a)$, which takes $b$ to one. Therefore, the ideal $B(1-b)$ maps to 0 and, as $K(A/(a))$ is proper, (1) says that $K(B/(1-b))$ is proper.




(4) If $A$ satisfies (P), then we can construct an R-morphism $fcolon Ato B$ such that, for every $ain Bx+By$ there is a $bin B$ with $ab=0$ and $K(B/(1-b))$ is proper.




Set $A_0=A$ and use (3) to construct a sequence of extensions $f_icolon A_ito A_{i+1}$ with left inverses such that, for every $ain A_ix+A_iy$ there is a $bin A_{i+1}$ with $ab=0$ and $K(A_{i+1}/(1-b))$ is proper. Note that, as each $f_i$ has a left inverse, (1) says that $A_{i+1}$ satisfies (P) whenever $A_i$ does. So, we can keep applying (3) to build up the entire sequence of extensions.



We now take the colimit $B={rm colim}A_i$ and let $f$ be the induced morphism (i.e, if we consider $A_isubseteq A_{i+1}$ then $B$ is the union and $f$ is inclusion). As each $A_ito B$ has a left-inverse, (1) shows that the required properties for $B$ are inherited from the individual $A_i$.




(5) Suppose that $A$ satisfies the following: for every $ain Ax+Ay$ there is a $bin A$ with $ab=0$ and $K(A/(1-b))$ is proper. Then, the R-algebra $B=A/K(A)$ satisfies the same property, and also $K(B)=0$.




That $K(B)=0$ follows quickly from the definition of $K$.
Suppose $ain Ax+Ay,bin A$ are such that $ab=0$ and $K(A/(1-b))$ is proper. Set $C=A/(1-b)$, so that $C/K(C)$ is not trivial. By (1), the canonical morphism $Ato C$ maps $K(A)$ into $K(C)$. So, it induces a morphism $Bto C/K(C)$. This takes $1-b$ to zero, so it induces an R-morphism $B/(1-b)to C/K(C)$. As $K(C/K(C))=0$, (1) implies that $K(B/(1-b))$ maps to zero, so is proper.




(6) If $B$ is the R-algebra constructed in (5), then $I=Bx+By$ contains only zero-divisors but $x,yin I$ map to regular elements in $R_z$ and $R_{1-z}$ respectively.




For any $ain I$ there is a $bin B$ with $ab=0$ and $K(B/(1-b))$ proper. In particular, $(1-b)$ must be a proper ideal, so that $bnot=0$ and $a$ is a zero divisor.



Finally, the property $K(B)=0$ implies that $x$ is regular in $B_z$ and $y$ is regular in $B_{1-z}$.

gn.general topology - Do Smash Products and Quotients Commute?

The easiest way I know to say what is going on is to resort to looking at
"products" of pairs:
$$
(X, A) times (Y, B) = ( Xtimes Y , Atimes Y cup Xtimes B).
$$
The point of this notation is that the functor $(X, A) mapsto (X/A, *)$
carries $(X, A) times (Y, B)$ to $X/A wedge Y/B$. We can iterate this procedure,
and I'll write $T^n(Y,X)$ for the subspace of $Y^n$
satisfying
$$
(Y, X)^n = ( Y^n, T^n(Y, X)).
$$
Thus $(Y/X)^{wedge n} = Y^n /T^n(Y,X)$.



You can easily check that
$$
T^n( Y, X) = lbrace (y_1, ldots, y_n) mid y_i in X mbox{for at least one $i$}rbrace.
$$



On the other hand $Y^{wedge n}/X^{wedge n}$ is the quotient
of $Y^n$ by the subspace
$$
T^n(Y,*) cup X^n,
$$
which is different (unless $X = *$).

Friday, 18 July 2014

lo.logic - Graph properties and FOL

I think none of the two answers so far really addresses the edited version of the question
which deals with infinite families of first order sentences.
The following graph properties (among others) can be expressed by infinitely many first order sentences, but not by finitely many (note that finitely many is equivalent to a single one):



(1) The graph is infinite: Take the collection of all sentences $phi_n$ where $phi_n$ states that the graph has at least $n$ vertices. A compactness argument shows that this is not finitely axiomatizable. (This is Kaveh's comment.)



(2) The graph is bipartite. Take the collection of sentences $psi_n$ where $psi_n$ says that the graph has no cycles of length $2n+1$. (This is Dave Marker's comment.)



It is possible to construct infinitely many different properties of graphs or directed graphs of this nature. Just take the property saying that for a fixed finite graph $H$,
the graph $G$ contains infinitely many disjoint copies of $H$.



It is worth pointing out that "the graph is finite" i.e., the opposite of (1), cannot be expressed by even infinitely many first order sentences.
(Given a first order theory $Phi$ that is satisfied by all finite graphs, add infinitely many constants to your vocabulary. Then, by compactness, the theory consisting of $Phi$ together with infinitely many sentences saying that the constants are pairwise distinct has a model, an infinite graph satisfying $Phi$.)



Similarly, not being bipartite is not axiomatizable by a first order theory
(take an ultraproduct of the graphs $C_{2n+1}$, $ninmathbb N$, where $C_{2n+1}$ is the cycle of length $2n+1$).



Being connected is also not expressible even by an infinite first order theory.
I currently don't know whether non-connectedness is first order axiomatizable.
I asked this here.

gravity - Is there a radius inside of which objects are (doppler) blue shifted?

No, there is no spherical volume of space within which objects all tend to approach us and thereby show blue-shift. The fact that the Andromeda galaxy is blue-shifted is because at these relatively small distances the cosmological red-shift is much smaller than further away. Andromeda's own (non-cosmological speed) is larger then the cosmological speed away from us and it happens to move towards us and is therefore blue-shifted. It could also have been moving away from us, and be red-shifted as a result, but that red-shift would not be cosmological.



I'd expect that the chance that a nearby (Local Group) galaxy is either moving away from us or moving towards us is 50%.



Furthermore, the Andromeda galaxy, and all other galaxies belonging to the Local Group, are gravitationally bound and would therefore not show any cosmological red-shift anyhow.



NB. The Andromeda Galaxy is actually 2.5 million light years away from us.

Thursday, 17 July 2014

deep sky observing - What is the line of light in galaxy cluster MACS J1206?

I have highlighted the feature I believe you are talking about. It looks like a galaxy that has been repeated (smaller green oval). There are several foreground objects that appear to be causing additional gravitational lensing (red arrows). Due to the shapes of the objects causing the lensing effect, the apparent image can be affected in many different ways, such a image duplication (multiple identical images of a galaxy, etc.), magnification, distortion, etc. The universe is massive and three-dimensional and we still don't understand that much of it yet.



enter image description here

nt.number theory - Notation/name for "Artin-Schreier roots"?

If x is an element of a field K and n is a positive integer, we have both a symbol and a name for a root of the polynomial t^n - x = 0: we denote it by x^{1/n} and call it an nth root of x.



Of course nth roots play a vital role in field theory, e.g. in the characterization of solvable extensions in characteristic 0. However, in characteristic p > 0, the extraction of a p-power root is a much different business: it gives rise to purely inseparable extensions, not composition factors of solvable Galois extensions.



To repair the characterization of solvable extensions in characteristic p as those being attainable as a tower of "radical" extensions, one needs to include the operation of taking roots of Artin-Schreier polynomials: t^q - t - x = 0, for q = p^a a power of the characteristic.



Finally my question: do we have a name for an element t solving the equation t^q - t = x and/or a special notation for it? I do not know one. Similarly, whereas classically we often speak of x as being "an nth power", in this case I find myself writing "x is in the image of the Artin-Schreier isogeny rho". Is there something better than this?

Tuesday, 15 July 2014

pr.probability - Correspondence between Viterbi algorithm and Smith-Waterman

A HMM for local alignment is shown at



http://books.google.com/books?id=R5P2GlJvigQC&pg=PA86



EDIT It's been 7 or 8 years since I really understood this stuff, but I have some old MATLAB code that implements Smith-Waterman. Since a cursory search doesn't show any such code, I figured I'd post it here. Although I don't think this is what you're really asking for, perhaps it will help you (or someone else).



function y=smithwatbl50lin(D1,D2)
% Smith-Waterman alignment of two ssDNAs
% (OR RNAS: U is equivalent to T here)
% with 5-3 ([uppercase] char) strand D*
% Uses the BLOSUM50 matrix and linear gap penalty.
% Alignment is done randomly (in that
% the traceback goes uniformly at random
% when it's got more than one possibility)

% As usual with my code, there's little room for error...

D1(find(D1=='U'))='T';
D2(find(D2=='U'))='T';

% BLOSUM50 matrix (use symmetry to minimize number of entries typed by hand)
bl50(1,1:20) = [5 -2 -1 -2 -1 -1 -1 0 -2 -1 -2 -1 -1 -3 -1 1 0 -3 -2 0];
bl50(2,2:20) = [7 -1 -2 -4 1 0 -3 0 -4 -3 3 -2 -3 -3 -1 -1 -3 -1 -3];
bl50(3,3:20) = [7 2 -2 0 0 0 1 -3 -4 0 -2 -4 -2 1 0 -4 -2 -3];
bl50(4,4:20) = [8 -4 0 2 -1 -1 -4 -4 -1 -4 -5 -1 0 -1 -5 -3 -4];
bl50(5,5:20) = [13 -3 -3 -3 -3 -2 -2 -3 -2 -2 -4 -1 -1 -5 -3 -1];
bl50(6,6:20) = [7 2 -2 1 -3 -2 2 0 -4 -1 0 -1 -1 -1 -3];
bl50(7,7:20) = [6 -3 0 -4 -3 1 -2 -3 -1 -1 -1 -3 -2 -3];
bl50(8,8:20) = [8 -2 -4 -4 -2 -3 -4 -2 0 -2 -3 -3 -4];
bl50(9,9:20) = [10 -4 -3 0 -1 -1 -2 -1 -2 -3 2 -4];
bl50(10,10:20)=[5 2 -3 2 0 -3 -3 -1 -3 -1 4];
bl50(11,11:20)=[5 -3 3 1 -4 -3 -1 -2 -1 1];
bl50(12,12:20)=[6 -2 -4 -1 0 -1 -3 -2 -3];
bl50(13,13:20)=[7 0 -3 -2 -1 -1 0 1];
bl50(14,14:20)=[8 -4 3 -2 1 4 -1];
bl50(15,15:20)=[10 -1 -1 -4 -3 -3];
bl50(16,16:20)=[5 2 -4 -2 -2];
bl50(17,17:20)=[5 -3 -2 0];
bl50(18,18:20)=[15 2 -3];
bl50(19,19:20)=[8 -1];
bl50(20,20)=5;
bl50=bl50+triu(bl50,1)';

% linear gap penalty
d=8;

% get integral character arrays
E1(D1=='A')='0';
E1(D1=='C')='1';
E1(D1=='G')='2';
E1(D1=='T')='3';
E2(D2=='A')='0';
E2(D2=='C')='1';
E2(D2=='G')='2';
E2(D2=='T')='3';

r=length(E1);
c=length(E2);

% translate to codons, dropping any possible end garbage
E1=E1(1:3*floor(r/3));
E2=E2(1:3*floor(c/3));
% Amino acid hash assigments
% A R N D C Q E G H I L K M F P S T W Y V
% 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
% Term --> 21
% List codons in alphabetical order (which equates to
% setting A=0, C=1, G=2, T=3 as we do above and using numerical order)
% and then hash to get
aha=[12 3 12 3 17 17 17 17 2 16 2 16 10 10 13 10];
ahc=[6 9 6 9 15 15 15 15 2 2 2 2 11 11 11 11];
ahg=[7 4 7 4 1 1 1 1 8 8 8 8 20 20 20 20];
aht=[21 19 21 19 16 16 16 16 21 5 18 5 11 14 11 14];
aminohash=[aha ahc ahg aht];
% NB. This will keep us from using zillions (ok, 64) if statements
% Now we're making polypeptides, terminating at the first 21 found
for codon=1:floor(r/3)
ah=aminohash( base2dec(E1(3*(codon-1)+1:3*codon),4) + 1);
if ah==21
break;
end
PP1(codon)=ah;
end
for codon=1:floor(c/3)
ah=aminohash( base2dec(E2(3*(codon-1)+1:3*codon),4) + 1);
if ah==21
break;
end
PP2(codon)=ah;
end

LPP1=length(PP1);
LPP2=length(PP2);

F=zeros(LPP1+1,LPP2+1);
traceback=zeros(LPP1+1,LPP2+1);
% raster-fill F...
for i=2:LPP1+1
for j=2:LPP2+1
temp1=F(i,j-1)-d; % left
temp2=F(i-1,j)-d; % up
temp3=F(i-1,j-1)+bl50(PP1(i-1),PP2(j-1)); % left and up
temp{i,j}=[0 temp1 temp2 temp3];
F(i,j)=max(temp{i,j});
temp4=find(temp{i,j}==F(i,j));
traceback(i,j)=temp4(ceil(rand*length(temp4)));
clear temp4;
end
end


iah='ARNDCQEGHILKMFPSTWYV'; % inverse amino hash string

% Find the biggest score entry. If several than pick one uniformly at random
mf=max(max(F));
tempf=find(F==mf);
[mfi,mfj]=ind2sub(size(F),tempf(ceil(rand*length(tempf))));

% in the end we'll flip alignment--REMEMBER!
backtrace=[mfi,mfj];
if F(backtrace(1),backtrace(2))==0
'no local alignment'
break;
elseif traceback(backtrace(1),backtrace(2))-1==3
alignment(:,1)=[iah(PP1(backtrace(1)-1));' ';iah(PP2(backtrace(2)-1))];
backtrace=backtrace-[1 1];
elseif traceback(backtrace(1),backtrace(2))-1==2
alignment(:,1)=[iah(PP1(backtrace(1)-1));' ';'-'];
backtrace=backtrace-[1 0];
elseif traceback(backtrace(1),backtrace(2))-1==1
alignment(:,1)=['-';' ';iah(PP2(backtrace(2)-1))];
backtrace=backtrace-[0 1];
end

k=1;
while max(backtrace>1)
k=k+1;
if F(backtrace(1),backtrace(2))==0
break;
elseif traceback(backtrace(1),backtrace(2))-1==3
alignment(:,k)=[iah(PP1(backtrace(1)-1));' ';iah(PP2(backtrace(2)-1))];
backtrace=backtrace-[1 1];
elseif traceback(backtrace(1),backtrace(2))-1==2
alignment(:,k)=[iah(PP1(backtrace(1)-1));' ';'-'];
backtrace=backtrace-[1 0];
elseif traceback(backtrace(1),backtrace(2))-1==1
alignment(:,k)=['-';' ';iah(PP2(backtrace(2)-1))];
backtrace=backtrace-[0 1];
else
break;
end
end

alignment=fliplr(alignment);

la=size(alignment,2);
for i=1:la
if alignment(1,i)==alignment(3,i)
alignment(2,i)=alignment(1,i);
elseif bl50(min(find(iah==alignment(1,i))),min(find(iah==alignment(3,i))))>0
alignment(2,i)='+';
else
alignment(2,i)=' ';
end
end

y=alignment;

lo.logic - Goedelizability and decidability of a property of Peano formulas

Francois has fully answered the question about the formulas phi(x) that PA proves to define a finite set, and I agree completely with what he said.



It is also natural, however, to consider which formulas phi(x) define a finite set in the standard model of arithmetic.



Here, the situation is even worse. The set of (codes for) formulas phi(x) that define a finite set in the standard model is not decidable, not enumerable, not co-enumerable, not computable from the halting problem nor computable even from finitely many iterations of the halting problem. This set is not definable by any arithmetic formula. To see this, suppose that it were. Suppose there were a formula F, such that F('phi(x)') was true iff phi(x) defined a finite set.
Note that as a special case, when phi(x) is a sentence whose truth does not depend on x, then the set { n | phi(n) is true in N } is either everything or empty, depending on whether the sentence is true or false in N, the standard model. Thus, for a sentence phi, the formula ¬F('phi') is true if and only if phi is true.
Thus, F provides a definable truth predicate, but this is exactly ruled out by Tarski's theorem on the non-definability of truth. The proof of this is to use the fixed-point lemma to find a sentence psi such that PA proves (F('psi') iff psi). Thus, psi asserts "I am not true", and this easily gives a contradiction.

Monday, 14 July 2014

distances - If we count Avogadro's Number of stars that are closest to Earth, how big that space would be?

I posted this as a question to Scientific Imagination (Area 51 proposal) a while ago. And it was suggested that this question "is perfectly acceptable on Astronomy SE", since "it's about stellar census".



I think that the answer to this question will be useful as a scale to appreciate the size of the universe.



So, If we count Avogadro's Number of stars that are closest to Earth, how big that space would be? And how many galaxies/clusters will be engulfed in that space? And for a start, let's just consider only three dimensions.

ag.algebraic geometry - Is there triangulated category version of Barr-Beck's theorem?

There isn't a descent theory for derived categories per se - one can't glue objects in the derived category of a cover together to define an object in the base. (Trying to apply the usual Barr-Beck to the underlying plain category doesn't help.)



But I think the right answer to your question is to use an enriched version of triangulated categories (differential graded or $A_infty$ or stable $infty$-categories), for which there is a beautiful Barr-Beck and descent theory, due to Jacob Lurie. (This is discussed at length in the n-lab I believe, and came up recently on the n-category cafe (where I wrote basically the same comment here..)
This is proved in DAG II: Noncommutative algebra. In the comonadic form it goes like this. Given an adjunction between $infty$-categories (let's call the functors pullback and pushforward, to mimic descent), if we have



  1. pullback is conservative (it respects isomorphisms), and

  2. pullback respects certain limits (namely totalizations of cosimplicial objects,
    which are split after pullback)

then the $infty$-category downstairs is equivalent to comodules over the comonad
(pullback of pushforward). (There's an opposite monadic form as well)
This can be verified in the usual settings where you expect descent to hold.
In other words if you think of derived categories as being refined to $infty$-categories (which have the derived category as their homotopy category), then everything you might want to hold does.



So while derived categories don't form a sheaf (stack), their refinements do:
you can recover a complex (up to quasiisomorphism) from a collection of complexes on a cover, identification on overlaps, coherences on double overlaps, coherences of coherences on triple overlaps etc.
More formally: define a sheaf as a presheaf $F$ which has the property that
for an open cover $Uto X$, defining a Cech simplicial object $U_bullet={Utimes_X Utimes_X Ucdotstimes_X U}$, then $F(X)$ is the totalization of the cosimplicial object $F(U_bullet)$. Then enhanced derived categories form sheaves (in appropriate topologies) as you would expect. This is of course essential to having a good theory of noncommutative algebraic geometry!

orbit - How was Io not torn apart by tidal forces during its formation?

No, it is not just a matter of migration. You need to take into account two facts.



One is that (as experience shows) Io's own gravity is enough to avoid it breaking by tidal forces. It has been like that through all its history: Io could not have been formed if it started aggregating today, but it was formed at the same time Europa and Ganymede did: they three were growing in parallel.



Other is that of the orbital resonances, which makes precisely that orbit with so simple integer number relations with those of Europa and Ganymede a stable one. Io could not have been formed in another place.

Sunday, 13 July 2014

physics - How did we come to the conclusion that light moves as fast as it does?

The very fact that light has finite speed was first discovered in astronomy. See for example the Wikipedia article on Aberration of light. Light which is responsible for this phenomenon travels in the empty space. By the way the difference of speed in the air and in vacuum is small, and the first precise measurement of this speed were made in the air.



To conclude from these measurements what is the speed in vacuum is easy by combining with an astronomical observation. Indeed, the difference of light speed in vacuum and in the air is responsible for the phenomenon which is called (astronomical) refraction. And refraction of the light coming from a star can be very exactly measured. If you know the speed of light in the atmosphere, you can compute the speed in vacuum. This is the general principle.

Saturday, 12 July 2014

nt.number theory - Why does the Riemann zeta function have non-trivial zeros?

Rather than the problem of why the zeta function has non-trivial zeros, let me address
Gowers's question of why the error term in the prime number theorem needs to be large. The short answer that I propose is: because the integers are so well distributed. To make this precise, I shall prove a general result on semigroups, showing that
either the "integers" in the semigroup or the "primes" must be poorly distributed -- one may think of this as an "uncertainty principle" that both primes and integers cannot be simultaneously smoothly behaved. This has the flavor of Beurling generalized primes, but I don't recall this result in the literature; maybe it exists already (indeed it does, see the edit below). Also note that the proof will not make any use
of the functional equation for $zeta(s)$ as this does not exist in a general semigroup.



EDIT: Indeed doing a literature search a few hours after posting this,
I found a paper of Hilberdink http://www.sciencedirect.com/science/article/pii/S0022314X04002069 which proves the
Theorem below, and with a similar method of proof.



Suppose that $1<p_1< p_2 <ldots$ is a sequence of real numbers (the "primes"), and
$1=a_1 le a_2 le ldots$ is the semigroup generated by them (the "integers"). Initially I made the assumption that the "integers" are distinct, and satisfy a mild spacing
condition $a_{n+1}-a_n gg n^{-1}$ (so that in particular there is unique factorization), but this is not necessary. Let $N(x)$ denote the number of
"integers" below $x$, and
$$
P(x) = sum_{p_n^{k} le x} log p_n,
$$
which is the analog of the usual $psi(x)$ (counting prime powers with weight $log p$). Assume that for some $delta>0$
$$
N(x) = Ax +O(x^{frac 12-delta}),
$$
for some non-zero constant $A$, and that
$$
P(x) = x + O(x^{frac 12-delta}).
$$



Theorem. Either the asymptotic formula for $N(x)$ or the asymptotic
formula for $P(x)$ must fail.



Put
$$
zeta_A(s) = sum_{n=1}^{infty} a_n^{-s} = prod_{n} Big(1-frac{1}{p_n^s}Big)^{-1}.
$$
By our assumptions on $N$ and $P$, the sum and product above converge absolutely in Re$(s)>1$. By the assumption on $N(x)$, $zeta_A(s)$ extends to an analytic function in Re$(s)>1/2-delta$ except for a simple pole at $s=1$ with residue $A$. By the assumption on $P$, we see that the logarithmic derivative
$$
-frac{zeta_A^{prime}}{zeta_A}(s) = sum_{n, k} frac{log p_n}{p_n^{ks}}
$$
extends analytically to Re$(s)>1/2-delta$ except for a simple pole at $1$. Thus
$zeta_A(s)$ has no zeros in Re$(s)>1/2-delta$.



New edit: Sketch of a second proof. Adapting the argument that the Riemann hypothesis implies the Lindelof hypothesis (see below), we obtain that $|zeta_A(s)| ll (1+|s|)^{epsilon}$ provided $s$ is not close to the pole at $1$, and that Re$(s)>1/2-delta/2$. From this and a standard contour shift argument we find that for large $N$ and any $t$,
$$
sum_{a_nle N} a_n^{it} = frac{N^{1+it}}{1+it} +O(N^{1/2-delta+epsilon} (1+|t|)^{epsilon}).
$$
What is used here is that we have Lindelof even a little to the left of the half line.



But the above identity can be seen to contradict the Plancherel formula. More precisely, let $T$ be a large power of $N$, and let $Phi$ be a non-negative function supported in $[-1,1]$ with non-negative Fourier transform. Then we see that (discarding all but the diagonal terms)
$$
int_{-infty}^{infty} Big| sum_{a_nle N} a_n^{it}Big|^2 Phi(t/T) dt
ge T{hat Phi}(0) sum_{a_nle N} 1 sim TN {hat Phi}(0).
$$
On the other hand, if we use our identity then the above is seen to be
$$
ll N^2 + T^{1+epsilon} N^{1-2delta}.
$$
This is a contradiction.



Original Proof: Below let's assume always that we are in the region Re$(s)>1/2-delta$, and that
the imaginary part is large so that we are not near the pole at $1$. From
the analytic continuation of $zeta_A$ (using that $N(x)$ is very regular), it follows
that there is an a priori polynomial bound $|zeta_A(s)|ll |s|^{B}$ in the region Re$(s)>1/2-delta/2$. Thus there is a bound for the real part of $log zeta_{A}(s)$, and by the Borel-Caratheodory lemma (standard complex analysis) one can bootstrap this to a bound for $|log zeta_A(s)|$. Then applying the Hadamard three circle theorem to $log zeta_A(s)$ one obtains a much better bound: $|zeta_A(s)| ll |s|^{epsilon}$. This is the usual proof that Riemann implies Lindelof. (At this stage, if we knew a "functional equation" we'd be done, as the usual $zeta(s)$ is large when Re$(s)<1/2$. This point appeared in Matt Young's answers earlier.)



Knowing the Lindelof hypothesis for $zeta_A(s)$ in Re$(s)> 1/2-delta/2$, we can
show the following approximate formula: for $sigma > 1/2- delta/4$
$$
zeta_A(sigma +it) = sum_{a_n le N} a_n^{-sigma-it} + O(|t|^{epsilon} N^{-delta/4}).
$$
The proof is standard; see the penultimate chapter of Titchmarsh for the real $zeta(s)$ where this holds when $sigma$ is strictly bigger than $1/2$, and our stronger result is true because we have Lindelof in a wider region.



Now we are ready to get our contradiction. Consider for large $T$
$$
int_T^{2T} |zeta_A(1/2+it)|^2 dt.
$$
To do this carefully it may be helpful to put in a smooth weight $Phi(t/T)$ above
(but this is not a paper!). Using the approximate formula derived above, and our mild spacing condition $a_{n+1}-a_n gg n^{-1}$,
we may see that for any $T^{epsilon} le N le T^{1/10}$ we have
$$
int_T^{2T} |zeta_A(1/2+it)|^2 dt sim T sum_{a_n le N} a_n^{-1} sim AT log N.
$$
But that's absurd! This completes our sketch proof.

mp.mathematical physics - Where does a math person go to learn statistical mechanics?


So, is there a good resource for statistical mechanics for the mathematically-minded?




If you are looking for a book, the real answer is "not really". As a mathematician masquerading as a physicist (more often than not of a statistical-physical flavor) I have looked long, hard, and often for such a thing. The books cited above are some of the best for what you want (I own or have read at least parts of many of them), but I would not say that any are really good for your purposes.



Many bemoan the lack of The Great Statistical Physics text (and many cite Landau and Lifshitz, or Feynman, or a few other standard references while wishing there was something better), and when it comes to mathematical versions people naturally look to Ruelle. But I would agree that the Minlos book (which I own) is better for an introduction than Ruelle (which I have looked at, but never wanted to buy).



Other useful books not mentioned above are Thompson's Mathematical Statistical Mechanics, Yeomans' Statistical Mechanics of Phase Transitions and Goldenfeld's Lectures On Phase Transitions And The Renormalization Group. None of them are really special, though if I had to recommend one book to you it would be one of these or maybe Minlos.



You might do better in relative terms with quantum statistical mechanics, where some operator algebraists have made some respectable stabs at mathematical treatments that still convey physics. But really that stuff is at a pretty high level (and deriving the KMS condition from the Gibbs postulate in the Heisenberg picture can be done in a few lines) so the benefit is probably marginal at best.

star systems - How do the orbits of Nu Scorpii and AR Cassiopeiae work?

I apologize for the way this question is worded but I don't think I know the proper verbiage. Basically, I want to know how the stars orbit one another in the two septuple star systems. For example, are there two massive stars in the middle that have the five remaining stars orbiting similar to the planets of our system; does one of the orbiting stars have one or more smaller star-"moons;" maybe there are stars trapped in one of the Lagrange Points. I hope that clears up the question.



I am aware that the orbits of these systems may not be known/understood. If that is the case, then I would appreciate the orbital relationships between the stars in the highest-multiple star system. If that is confusing, I mean, if we know the orbits for the stars in a sextuple system; if not then a quintuple... etc.



Finally, as an added bonus, I would appreciate being told the correct verbiage. Thank you, much.

Definition of elementary number theory

Elementary number theory is better defined by its focus of interest than by its methods of proof. For this reason, I rather like to think of it as classical number theory. It deals with integers, rationals, congruences and Diophantine equations within a framework recognizable to eighteenth-century number theorists. Algebraic number theory does not qualify because of its level of abstraction, even though algebraic numbers were sometimes applied to particular problems in number theory before the nineteenth century. Analytic number theory is not only distinguished by the use of complex and harmonic analysis (for many problems these are by no means indispensable), but even more by the modern emphasis on counting the number of solutions to number theoretical problems approximately. In the eighteenth century they also liked to count the number of solutions when they could, but they wanted exact answers, which severely limited the range of counting problems that they could solve. It is true that Dirichlet brought analysis into number theory, but he also counted the number of divisors of integers approximately by averaging, and Gauss before him had done the same for class numbers and genera of binary quadratic forms, as one can see from remarks in article 301 in the Disquisitiones Arithmetica (but he never published his proofs). And Legendre in 1808 published an approximation to the counting function pi(x) of the primes, which was the start of the line of development that led to the Prime Number Theorem (Gauss had also found an approximation, but this was published only in 1863 in his collected works). The systematic acceptance of approximate answers in number theory really is a nineteenth century development. It is obvious from his interests and techniques that Euler could have found many such results if he had wanted. In 1838 when Dirichlet wrote his first paper on the approximate average number of divisors, the cupboard was so bare that he could only cite the remarks of Gauss in article 301 and Stirling's and allied formulas (!) as prior work to motivate his own.



There is a field that was absolutely central to number theory from its earliest days but for which elementary tools no longer suffice by themselves - Diophantine analysis. Actually, this transition of Diophantine analysis from elementary to non-elementary status began in the nineteenth century with the use of algebraic number theory, and gathered force in the twentieth century. But of course, there is also a huge amount of Diophantine analysis by classical techniques from the nineteenth and twentieth centuries.

ag.algebraic geometry - What is the link between sections and sections? (schemes)

Sections to the morphism $X to S$ are more or less $S$-rational points of $X$: if for example $S = Spec(R)$ and $X = Spec(R[x_1, dots, x_n]/(f_1, dots, f_m))$ with polynomials $f_1, dots, f_m in R[x_1, dots, x_n]$, then sections to $X to S$ correspond to points $(a_1, dots, a_n) in R^n$ with $f_i(a_1, dots, a_n) = 0$ for all $i$.



On the other hand, the elements of $mathcal{O}_X(U)$ can be seen as holomorphic functions on $U$.



So the one kind of sections can be seen as "points" of the geometric object, the others can be seen as "functions" on the geometric object.

Friday, 11 July 2014

ag.algebraic geometry - Do coequalizers in RingSpc automatically lead to descent?

I'm currently interested in the following result:



Let $f: X to Y$ be a fpqc morphism of schemes. Then there is an equivalence of categories between quasi-coherent sheaves on $Y$ and "descent data" on $X$. Namely, the second category consists of quasi-coherent sheaves $mathcal{F}$ on $X$ with an isomorphism $p_{1}^*(mathcal{F}) simeq p_2^*(mathcal{F})$, where $p_1, p_2: X times_Y X to X$ are the two projections. Also, a diagram involving an iterated fibered product is required to commute as well (the cocycle condition).



In Demazure-Gabriel's Introduction to Algebraic Geometry and Algebraic Groups, it is proved (under the name ffqc (sic) descent theorem) that the sequence
$$ X times_Y X to^{p_1, p_2} X to Y$$ is a coequalizer in the category of locally ringed spaces under the above hypotheses. If I am not mistaken, this is the same as (or very closely equivalent to) the theorem that says that representable functors are sheaves in the fpqc topology. On the other hand, D-G give a fairly explicit description of the quotient space.



Question: For a coequalizer diagram of (locally) ringed spaces schemes,
$$A rightrightarrows^{f,g} B to C,$$
is there a descent diagram for quasi-coherent sheaves on $A,B,C$? In particular, does the D-G form of the descent theorem directly, by itself, imply the more general one for quasi-coherent sheaves?



My guess is the answer is no. First, I've heard that the coequalizer condition above is actually very weak. So, suppose instead we have that $A rightrightarrows^{f,g} B to C,$ is a coequalizer and any base-change of it is a coequalizer. Does that imply that there is a descent diagram for quasi-coherent sheaves?



My guess is that the answer to the modified question is still no, for the meta-reason that Vistoli in FGA Explained spends much more time on proving descent for quasi-coherent sheaves than proving that the fpqc topology is subcanonical. On the other hand, I'd like to see a counterexample.



(N.B. I initially asked this question on Math.SE. I was advised to re-ask it here.)

co.combinatorics - Lifting matrices mod 2 to integers.

I want to address the weaker question in a more general setting:



Given any matrix over $mathbb{Z}$ with determinant $1$ mod $m$. Is it possible to add multiples of $m$ to each entry to get a matrix with determinant one?



Call a matrix transformable, iff this is possible. Given any $A$ matrix over $mathbb{Z}$. Consider the image of the induced map $mathbb{Z}^nrightarrow mathbb{Z}^n$. It is a submodule of $mathbb{Z}^n$.
For any such submodule with rank $k$ it is always possible to find a basis of $b_1,ldots,b_n$ of $mathbb{Z}^n$ and numbers $r_1,ldots,r_k$, such that $r_i|r_{i+1}$ for $i=1ldots k-1$ and $r_1b_1,ldots r_kb_k$ is a basis of the given submodule. The proof of this is roughly the same as the proof of the structure theorem of finitely generated abelian groups.



This tells us, that we can write our matrix $A$ in the form $A=BDC$, where $B$ and $C$ are invertible and $D$ is a diagonal matrix, with the entries $r_1,ldots ,r_n$ from above. As the determinant is not zero, the image must have full rank and hence $k=n$.



It is easy to see, that left (right) multiplication with invertible matrices doesn't change the transformability. So we might assume, that $A$ has the given form. We will reduce inductively the number of non-one diagonal entries of $A$ without changing the determinant modulo $m$. This number is the length of $mathbb{Z}^n/Im(A)=bigoplus_{i=1}^nmathbb{Z}/r_i$.



Suppose there is a non-one diagonal entry $r_i$. Then there has to be a second one $r_{i'}$, as the product of all diagonal entries is $1$ mod $m$. Otherwise this is the last non-one diagonal entry and it has to be $1$ mod $m$ and we can transform the matrix into the identity matrix.



Now add $m$ to $r_i$ and it will become coprime to the other non-one entry $r_{i'}$, as $r_i|r_{i'};,;gcd(m,r_i)=1=gcd(m,r_i')$. Hence we get $mathbb{Z}/(r_i+m)oplusmathbb{Z}/r_{i'}congmathbb{Z}/((r_i+m)r_{i'})$
Call the resulting matrix $A'$ and observe that the length of $mathbb{Z}^n/Im(A')$ is one smaller, than the length of $mathbb{Z}^n/Im(A)$. Then we can again find the normal form for $A'$ and repeat this process until we end up with the identity matrix. Hence every matrix with determinant $1$ mod $m$ is transformable.

Wednesday, 9 July 2014

rotation - What is the axial tilt of a planet measured relative to?

I am very much a beginner on the astronomy front but I understand about planets having different axial tilts, hence why Venus turns the opposite direction from the Earth and Uranus turns sideways.



However, I am confused as to what the axial tilt bearing is measured from. For example, if all planets were in a perfect line from the sun, would they spin perfect in alignment on their own various axial tilts? Would some still be 'off-set' by a few degrees to one side or the other?



Do we actually know please?

Tuesday, 8 July 2014

amateur observing - How to see Saturn's rings through a pair of binoculars?

You will not be able to see Saturn's rings directly with a 15x70 telescope, but you will be able to notice that Saturn appears elliptical, not round. This is due to the shape of the rings around the planet, but you will not be able to resolve the hole between the rings and the planet (much less the Cassini gap).



Just as an impractical suggestion in the interest of giving to you a more complete answer, you might be able to mount a camera and use image stacking to see a smidge of ring and differentiate it from the sphere of the planet. If you do go this route, please post online about your setup and results, whether it succeeds or not.

co.combinatorics - How many n×n (0,1)-matrices with row/column sum 4 and trace 0 are there?

This is somewhat of an ill-posed question, since "closed" is a very vague term. See: H. S. Wilf, `What is an answer?', Amer. Math. Monthly 89 (1982), 289-292.



A $k times n$ Latin rectangle is an $k times n$ matrix with symbols from $mathbb{Z}_n$ in which each symbol occurs exactly once in each row and at most once in each column. A Latin rectangle is called normalised if its first row is $(0,1,ldots,n-1)$. Given a $5 times n$ normalised Latin rectangle $L=(l_{ij})$ we can construct a $n times n$ $(0,1)$-matrix $T=(t_{ij})$ by setting $t_{ij}=1$ if symbol $j$ appears in column $i$ of $L$ (and $t_{ij}=0$ otherwise). We see that every row and every column of $T$ contains exactly $5$ copies of the symbol $1$. Moreover, since $L$ is normalised, $t_{ii}=1$ for all $i in mathbb{Z}_n$. Let $T'$ be formed from $T$ by setting main diagonal of $T$ to $(0,0,ldots,0)$. Then $T'$ contains row sum $4$ and column sum $4$ and trace $0$.



Let $X_{4,n}$ be the number of $n times n$ $(0,1)$-matrices with column sum 4 and row sum 4 and trace $0$. It is a well-known result that each $(0,1)$-matrix counted by $X_{4,n}$ can be formed as $T'$ for some normalised $5 times n$ Latin rectangle (from Hall's Marriage Theorem). We conclude that $X_{4,n} leq K_{5,n}$, where $K_{5,n}$ is the number of normalised $5 times n$ Latin rectangles. Erdos and Kaplansky showed that $K_{5,n} sim n!^4 e^{-10}$ (although you can expect $X_{4,n}$ to be much less than this).



There are several published formulae for the number of Latin rectangles (in fact, I have a paper on this topic). There's a good chance that modifying one of these formulae would yield a formula for $X_{4,n}$. One good example is by Doyle: http://arxiv.org/abs/math/0703896.



In addition, there is a easy graph theoretic interpretation of $X_{4,n}$. Let $G$ be the complete bipartite graph with bipartition ${v_1,v_2,ldots,v_n} cup {u_1,u_2,ldots,u_n}$. Then $X_{4,n}$ is the number of $4$-regular subgraphs $H$ of $G$ minus the one-factor ${v_i u_i}_{1 leq i leq n}$ (on the same vertex set as $G$). Given such a subgraph $H$, let $X=(x_{ij})$ be defined by $x_{ij}=1$ if $v_i$ is adjacent to $u_j$ in $H$.

fa.functional analysis - spectra of sums and products in (Banach) algebras [was: Spectrum in Banach Algebra]

In general Banach algebras the spectral radius is neither subadditive nor submultiplicative; in particular, neither of the two properties you mention holds.



$2times 2$ (real or complex) matrices should suffice to give examples, so this is not to do with any subtleties of infinite-dimensional algebras.



For example, take $ a= left(begin{matrix} 0 & 1 \\ 0 & 0 end{matrix} right) $.
Note that this is nilpotent, so the only point in the spectrum is zero, and hence $sigma(a)sigma(b)= {0}$ for any other matrix $b$. On the other hand, we can find $b$ for which $ab$ is not nilpotent, so that $sigma(ab)notsubseteq {0}$. A simple choice which works is $b=left(begin{matrix} 0 & 0 \\ 1 & 0 end{matrix} right)$, since then $ab$ is a non-zero projection (=idempotent) and so contains $1$ in its spectrum.



The same pair also works as a counter-example for the "additive question". For since $sigma(a)=sigma(b)={0}$, we have $sigma(a)+sigma(b)={0}$. On the other hand, $a+b$ is a reflection and hence its spectrum is ${-1,1}$




(edited 11-02-10 for a couple of typos/omissions)



Jonas Meyer points out in comments that one can pose the following converse question: let $A$ be a Banach algebra with identity, and suppose that we have



($*$) $sigma(a+b)subseteqsigma(a)+sigma(b)$ and $sigma(a)sigma(b)subseteqsigma(ab)$ for all $a,bin A$.



Must $A$ be commutative?



As Jonas also pointed out in comments, the answer is in general `no': the algebra of strictly upper-triangular matrices (or, more precisely, the algebra of scalar+strictly upper triangular $mtimes m$ matrices, for some fixed $m$) gives a counterexample, at least when $mgeq 3$.



A more careful version of this argument shows, I think, that if $A$ is a finite-dimensional algebra with identity, such that $A/{rm Rad}(A)$ is commutative, then $A$ will satisfy condition $(*)$. I suspect that the same might be true for any unital Banach algebra that is commutative modulo its radical, i.e. that the finite-dimensional hypothesis is unnecessary; but at present I'm a bit too tired to check this properly.



We could therefore modify the question yet further, and ask if a Banach algebra that satisfies condition $(*)$ must be commutative modulo its radical. The answer turns out to be yes, after a wander down memory lane and a forage on MathSciNet:



MR0461139 (57 #1124)
J. Zemánek. Spectral radius characterizations of commutativity in Banach algebras.
Studia Math. 61 (1977), no. 3, 257--268.



The MR is short and informative enough to give in full, for those without access:




It is standard that the spectral radius is subadditive and submultiplicative on any commutative complex Banach algebra. The author proves that, for a complex Banach algebra $A$, the following three conditions are equivalent: (1) the spectral radius is sub-additive on $A$, (2) the spectral radius is submultiplicative on $A$, (3) $A$ is commutative modulo its radical. Some applications of this result to other problems in Banach algebras are given, along with references to a number of related papers.

linear algebra - an exercise on integrality of characteristic polynomials

Suppose A is a matrix with coefficient in $Q_{ell}$, and all the coefficients of its char. polynomial are in $Z$ (thus an integral polynomial). Prove that the char. polynomial of $A^n$ is also integral. (This question probably has nothing to do with the base field $Q_{ell}$)



This question actually comes from a remark in Serre's book "abelian l-adic representations", so allow me to tag it with "number theory"...

Why isn't the moon dark in a solar eclipse?

Light refracts in the atmosphere, causing the light to scatter in many directions. This light scattering is what causes our day to be blue, our sunsets to be their red-orange-violetish colors and allows us to see the light of the Sun shortly before sunrise and after sunset. As this only begins happening in any significant amount after the light enters Earth's atmosphere, there should not be any notable difference in the color or intensity of light between you and the moon and any other part of the sky at a relative angle from the Sun.



On the other hand, the foreground does not have direct light on the photographer's side (because it is nearly between the sun and the photographer) and there is a very limited amount of atmosphere to scatter light. As a result, less light will be coming to the camera from that direction.

Monday, 7 July 2014

gr.group theory - Hopfian and Co-Hopfian groups (examples)

Before going to examples, here are some general comments:



a) Proving that a finitely generated group is Hopfian is usually pretty hard unless the group is residually finite e.g. finitely generated subgroups of $GL(n,mathbb R)$ are residually finite, hence Hopfian by an old result of Mal'cev.



b) A common method in proving that a group $G$ is co-Hopfian is to use an invariant of $G$ that is multiplicative under passing to finite index subgroups. If $G$ has a nonzero such invariant, then $G$ has no finite index subgroups isomorphic to itself.
For example, if $G$ is the fundamental group of a finite aspherical CW-complex of nonzero Euler characteristic, then $G$ has no finite index subgroups isomorphic to itself.



c) If $G$ is the fundamental group of a closed aspherical manifold, then $G$ has no infinite index subgroups isomorphic to $G$ (look at top-dimensional homology).



d) Euler characteristic, signature, $L^2$-Betti numbers, simplicial volume are are multiplicative under finite covers of closed aspherical manifolds, so if $G$ is the fundamental group of a closed aspherical manifold with say nonzero signature, then $G$ is co-Hopfian.



Here are some specific examples to add to Richard's example of one-ended torsion free hyperbolic groups. All of the following groups are linear, hence residually finite, hence Hopfian.



  1. The fundamental groups of closed locally symmetric spaces of nonpositive curvature without local flat factors are co-Hopfian because they have nonzero simplicial volume
    thanks to a result of Lafont-Schmidt.


  2. If memory serves me, it is possible to figure out which geometric 3-manifold groups are co-Hopfian. For example, the $SL_2(mathbb R)$-Seifert fibered spaces have a certain invariant detecting volume of the base $2$-orbifold which is multiplicative under finite covers. Check papers of Pierre Derbez in arXiv.


  3. Fundamental groups of some nilmanifolds are co-Hopfian, see my paper here.


  4. Delzant-Potyagalo classified co-Hopfian Kleinian groups (in real hyperbolic space of any dimension). See
    here.


Computing the Sérsic profile of a galaxy from jpg images

I am trying to calculate the Sérsic profile of various galaxies from the SDSS based on the images provided by the galaxy zoo site. I am doing this as part of a kaggle competition on using machine learning to predict galaxy morphology. I have no chance of a high rank in this competition so I'm not hesitant to ask for help.



I used the R contourLines function to identify the isophotes of the galaxy and then fit ellipses to each isophote. This seemed to work well, the isophotes are almost always well fit by the ellipses and the ellipses are nearly concentric. Then letting I be the pixel intensity of an isophote and R be the length of the semi-major axis of the corresponding ellipse, I need to fit an equation of the form



log I(R) = log I_0 - k * R^(1/n)


The simple approach seemed to be to take the log of both sides and use OLS regression, so I fit a linear model in R of the form



log(log(I)) ~ log(R)


The resulting graphs showed a good fit but the resulting Sérsic indices n are almost always less than one and never as big as two. This doesn't seem right since indices of 4 or higher seem common in my reading. I don't get anywhere near 4 for an image of M87.



Possibly taking log log flattens things out too much and the index is not responsive enough. I tried using nls to work with just the log but it didn't move the indices much.



Is there any standard software or algorithm for computing the Sérsic index from an image? Are there reference images I can work from that would let me check if my algorithm is reasonable? Any recommendations on how to proceed would be welcome.



UPDATE: I have found the programs GALFIT and GIM2D which look like they might be useful. Any other software that is commonly used for this?

dg.differential geometry - Are rotations generated by translations, scalar multiplications and inversions?

Your question is not clear: what do you mean by "do we need" ? Moreover, inversions are not usually considered as Moebius transformation (they do not preserve orientation), and finally you do not precise what scalars you consider (real or complex).



Yet, I guess that one of the two following facts should answer your question: multiplication by a complex scalar is a similarity (a rotation if the number has modulus $1$), and the composition of two reflexions (which are inversions with respect to lines, which are circles containing the point at infinity) is a rotation.

Sunday, 6 July 2014

gravity - How does light affect the universe?

Old question, but I'll address something that hasn't been brought up by the previous answers.



Photons $simeq$ CMB photons (to first order)



As the others has already said: yes, light has energy and hence it gravitates. The bulk of photons that permeate the Universe isn't of stellar origin, though, but is in fact the cosmic microwave background, the energy density of which several orders of magnitude larger than other photons, as seen in the graph from this answer to "Number density of CMB photons". In terms of number density, there are 4-500 photons per cm$^3$.



Space is big and isotropic



Since CMB photons are isotropically distributed, the ever-so-small radiation pressure is equal in all directions, and hence cancels out. And although we're all the time bombarded by both CMB photons and stellar photons, space is so mind-bogglingly big (D. Adams, 1978) that if you consider a random photon in the Universe, the probability of it hitting anything at all is negligible. Roughly 90% of the CMB photons have traveled for 13.8 billion years without hitting anything; the remaining 10% interacted with the free electrons that were released after reionization, but weren't absorbed, just polarized, and by far most of these interactions took place shortly after reionization; by now, the Universe has simply expanded too much.



Photons are redshifted



Although there is energy in photons, and hence they add to gravitation, first of all they're homogeneously distributed in the Universe (and thus pulls equally in all directions), and second their energy density is negligible compared to baryons ("normal matter" like gas, stars, and planets), dark matter, and dark energy. In fact, their relative densities are ${rho_mathrm{bar},rho_mathrm{DM},rho_mathrm{DE},rho_mathrm{phot}}/rho_mathrm{total} = {0.05,0.27,0.68,10^{-4}}$. But this was not always the case. As the Universe expands and new space is created, the density of matter decreases as $1/a^3$, where $a$ is the scale factor ("size") of the Universe. The same is true for photons, but since additionally they're redshifted proportionally to $a$, their energy density decreases as $1/a^4$. That means that as you go back in time, the relative contribution of photons to the energy budget increases, and in fact until the Universe was 47,000 years old, its dynamics was dominated by radiation.

ac.commutative algebra - What does primary mean geometrically?

As Harry suggests in his answer, it is probably more intuitive to work with associated
primes, rather than the slightly older language of primary decompositions.



If $I$ is an ideal in $A$, an associated prime of $A/I$ is a prime ideal of $A$ which
is the full annihilator in $A$ of some element of $A/I$. A key fact is that for any
element $x$ of $A/I$, the annihilator of $x$ in $A$ is contained in an associated prime.



The associated primes are precisely the primes that contribute to the primary decomposition
of $I$. Geometrically, $wp$ is an associated prime of $A/I$ if there is a section
of the structure sheaf of Spec $A/I$ that is supported on the irreducible closed set $V(wp)$. E.g. in the example given in Cam's answer, the function $x^2 - x$ is not identically
zero on $X:=$ Spec ${mathbb C}[x,y]/(x y, x^3-x^2, x^2 y - xy),$ but it is annihilated
by $(x,y)$, and so is supported at the origin (if we restrict it to the complement of
$(0,0)$ in $X$ then it becomes zero).



The non-minimal primes of $I$ that play a role in the primary decomposition of $I$
(i.e. appear as associated primes of $A/I$) are the generic points of the so-called
embedded components of Spec $A/I$: they are irreducible closed subset of Spec $A/I$
that are not irreducible components, but which are the support of certain sections
of the structure sheaf.



An important point is that if $I$ is radical, so that $A/I$ is reduced, then
there are no embedded components: the only associated primes are the minimal primes
(for the primary decomposition of $I$ is then very simple, as noted in the question:
$I$ is just the intersection of its minimal primes).



There is a nice criterion for a Noetherian ring to be reduced: Noetherian $A$ is
reduced if and only if $A$ satisfies $R_0$ and $S_1$, i.e. is generically reduced,
and has no non-minimal associated primes. Geometrically, and applied to $A/I$
rather than $A$, this says that if $A/I$ is generically reduced, then the embedded
components are precisely the irreducible closed subsets of Spec $A/I$ over which
the nilpotent sections of the structure sheaf are supported. This may help
with your ``nilpotentification'' mental image.

pr.probability - How to choose $L$ size-$m$ subsets of ${1,ldots,n}$ to maximize expected max overlap with another randomly chosen subset?

GIVEN: Positive integers $n,m,L$ and probabilities $p_1, p_2, ldots, p_n$.



GOAL: Choose $L$ size-$m$ subsets $S_1, S_2, ldots, S_L$ of ${1,2,ldots,n}$ to maximize $displaystyle mathbb{E}[ max_{1leq ell leq L} |S cap S_ell| ]$, where the expectation is taken over (all $2^n$) $S subseteq {1,2,ldots,n}$ using the pdf $f$ defined as
[ f(S) = left( prod_{i in S} p_i right) left( prod_{i notin S} (1-p_i) right) . ]



(Equivalently, $S$ is chosen by examining each $i in {1,2,ldots,n}$ independently, choosing $i$ to be in $S$ with probability $p_i$.)



Note: the problem is trivial unless $m < n$ and $L < binom{n}{m}$.



A nice, closed-form expression for $displaystyle mathbb{E}[ max_{1leq ell leq L} |S cap S_ell| ]$ (in terms of $n,m,L$ and the $p_i$ and the $S_ell$) would be a good start.



EDIT:



WLOG, $p_1 geq p_2 geq cdots geq p_n$.



Special cases:



  • $L=1$: $S_1 = { 1, 2, ldots, m }$ maximizes $displaystyle mathbb{E}[ max_{1leq ell leq L} |S cap S_ell| ]$.


  • $m=1$: Denote by $x_i$ the single element of $S_i$ (i.e., $S_i = { x_i }$) and $X = { x_1, x_2, ldots, x_L } = bigcup S_i$. $displaystyle mathbb{E}[ max_{1leq ell leq L} |S cap S_ell| ] = sum_{S cap X neq emptyset} 1 cdot f(S) = 1 - sum_{S subseteq { 1, 2, ldots, n } setminus X} f(S) = 1 - prod_{i in X} (1-p_i)$, which is maximized by the greedy solution of $S_i = { i }$ (for $1 leq i leq L$).


Here's how I examined a few more special cases:



This example ($(n,m,L)=(4,1,2)$) falls under the already discussed $m=1$ case, but I present it to hopefully make the presentation of subsequent examples clearer:



$binom{binom{n}{m}}{L} = binom{binom{4}{1}}{2} = 6$ possibilities for $(S_1,S_2)$:



1100 1ooo,o1oo,  p1+p2        -  p1p2
1010 1ooo,oo1o, p1+ p3 - p1p3
1001 1ooo,ooo1, p1+ p4 - p1p4
0110 o1oo,oo1o, p2+p3 - p2p3
0101 o1oo,ooo1, p2+ p4 - p2p4
0011 oo1o,ooo1, p3+p4 - p3p4


  • Column 1: the sum of the subsequent $L=2$ columns (I used "o" for "0" in those for readability)

  • Column 2: represents $S_1$ (the 1st bit is 1 iff $1 in S$ (0 otherwise), the 2nd bit is 1 iff $2 in S$, etc.)

  • Column 3: represents $S_2$

  • Column 4: $displaystyle mathbb{E}[ max_{1leq ell leq L} |S cap S_ell| ]$ (formatted to put the degree-1 terms all together first, then the degree-2 terms, etc., with spacing so you can look up and down and see which other rows share those terms).

$binom{binom{n}{m}}{L} = binom{binom{4}{2}}{2} = 15$ possibilities for $(S_1,S_2)$:



2110 11oo,1o1o, p1+p2+p3     -                 p2p3
2101 11oo,1oo1, p1+p2+ p4 - p2p4
1210 11oo,o11o, p1+p2+p3 - p1p3
1201 11oo,o1o1, p1+p2+ p4 - p1p4
2011 1o1o,1oo1, p1+ p3+p4 - p3p4
1120 1o1o,o11o, p1+p2+p3 - p1p2
1021 1o1o,oo11, p1+ p3+p4 - p1p4
1102 1oo1,o1o1, p1+p2+ p4 - p1p2
1012 1oo1,oo11, p1+ p3+p4 - p1p3
0211 011o,o1o1, p2+p3+p4 - p3p4
0121 011o,oo11, p2+p3+p4 - p2p4
0112 01o1,oo11, p2+p3+p4 - p2p3

1111 11oo,oo11, p1+p2+p3+p4 - p1p3-p1p4-p2p3-p2p4 + p1p2p3+p1p2p4+p1p3p4+p2p3p4 - 2p1p2p3p4
1111 1o1o,o1o1, p1+p2+p3+p4 - p1p2- p1p4-p2p3- p3p4 + p1p2p3+p1p2p4+p1p3p4+p2p3p4 - 2p1p2p3p4
1111 1oo1,o11o, p1+p2+p3+p4 - p1p2-p1p3- p2p4-p3p4 + p1p2p3+p1p2p4+p1p3p4+p2p3p4 - 2p1p2p3p4


I separated the rows into two groups; two rows are in the same group if their first-column entries are permutations of each other.



Notes: the coefficient of 2 in the degree-4 terms of the last three rows seems interesting to me. The sign of a term is the negation of $(-1)$ to the degree of the term.



$binom{binom{n}{m}}{L} = binom{binom{4}{2}}{3} = 20$ possibilities for $(S_1,S_2,S_3)$:



3111  11oo,1o1o,1oo1,  p1+p2+p3+p4   -                  p2p3-p2p4-p3p4   +                        p2p3p4
1311 11oo,o11o,o1o1, p1+p2+p3+p4 - p1p3-p1p4- p3p4 + p1p3p4
1131 1o1o,o11o,oo11, p1+p2+p3+p4 - p1p2- p1p4- p2p4 + p1p2p4
1113 1oo1,o1o1,oo11, p1+p2+p3+p4 - p1p2-p1p3- p2p3 + p1p2p3

2220 11oo,1o1o,o11o, p1+p2+p3 - p1p2p3
2202 11oo,1oo1,o1o1, p1+p2+ p4 - p1p2p4
2022 1o1o,1oo1,oo11, p1+ p3+p4 - p1p3p4
0222 o11o,o1o1,oo11, p2+p3+p4 - p2p3p4

2211 11oo,1o1o,o1o1, p1+p2+p3+p4 - p1p4-p2p3- p3p4 + p1p3p4+p2p3p4 - p1p2p3p4
2121 11oo,1o1o,oo11, p1+p2+p3+p4 - p1p4-p2p3-p2p4 + p1p2p4+ p2p3p4 - p1p2p3p4
2211 11oo,1oo1,o11o, p1+p2+p3+p4 - p1p3- p2p4-p3p4 + p1p3p4+p2p3p4 - p1p2p3p4
2112 11oo,1oo1,oo11, p1+p2+p3+p4 - p1p3- p2p3–p2p4 + p1p2p3+ p2p3p4 - p1p2p3p4
1221 11oo,o11o,oo11, p1+p2+p3+p4 - p1p3-p1p4- p2p4 + p1p2p4+p1p3p4 - p1p2p3p4
1212 11oo,o1o1,oo11, p1+p2+p3+p4 - p1p3–p1p4-p2p3 + p1p2p3+ p1p3p4 - p1p2p3p4
2121 1o1o,1oo1,o11o, p1+p2+p3+p4 - p1p2- p2p4–p3p4 + p1p2p4+ p2p3p4 - p1p2p3p4
2112 1o1o,1oo1,o1o1, p1+p2+p3+p4 - p1p2- p2p3- p3p4 + p1p2p3+ p2p3p4 - p1p2p3p4
1221 1o1o,o11o,o1o1, p1+p2+p3+p4 - p1p2- p1p4- p3p4 + p1p2p4+p1p3p4 - p1p2p3p4
1122 1o1o,o1o1,oo11, p1+p2+p3+p4 - p1p2- p1p4-p2p3 + p1p2p3+p1p2p4 - p1p2p3p4
1212 1oo1,o11o,o1o1, p1+p2+p3+p4 - p1p2-p1p3- p3p4 + p1p2p3+ p1p3p4 - p1p2p3p4
1122 1oo1,o11o,oo11, p1+p2+p3+p4 - p1p2-p1p3- p2p4 + p1p2p3+p1p2p4 - p1p2p3p4


Note: the sign of a term is the negation of $(-1)$ to the parity of the degree of the term only in the first and third block of rows — the degree-3 terms in the 4 rows in the middle block are all negative.

Saturday, 5 July 2014

tag removed - Signed minimum?

I am looking for references to papers which might have defined a 'signed minimum' equivalent to
$$smin(x,y) ::= left(frac{textit{signum}(x)+textit{signum}(y)}{2}right)cdot min(|x|,|y|) $$
where $textit(signum)(x)$ is $-1$ for $xlt 0$, $1$ for $xgt 0$ and an arbitrary (finite) value for $x=0$. Also, any simpler expression for $smin$ would be appreciated. The above definition works for $mathbb{R}, mathbb{Q}$, and $mathbb{Z}$.



Intuition: the 'signed minimum' between x and y is the one closest to $0$ if they both have the same sign, otherwise it's 0.

Thursday, 3 July 2014

differential topology - Tangent bundle of the long line

Question: Is the tangent bundle of the long line $L$ homeomorphic to $Ltimesmathbb R$? I'd guess that the answer doesn't depend on choice of differentiable structure, but maybe it does.



Motivation: One night at dinner, someone brought up a puzzle involving infinitely many prisoners standing in a line, and someone asked if there was a physical reason that the collection of prisoners had to be countable. In other words, might (one of) the directions in the physical universe be modeled after the long line?



The answer to that question is no: the universe has a metric, but the long line has no Riemannian structure. The standard explanation for this is that a Riemannian manifold is metrizable, and a non-paracompact space isn't. Without using fancy theorems, one could instead suppose that $L$ was Riemannian and look at the exponential map starting at a point $x$ going in the increasing direction. This is an increasing function from $mathbb R$ to $L$, so it converges to some point $y$. Then the exponential map from $y$ downwards reaches $x$ in finite time, contradiction.



In any case, the basic result is that $L$ is not Riemannian, so its tangent bundle must be nontrivial, but only in the differential sense. One could try to instead consider a continuous metric (if the tangent bundle were indeed topologically trivial), but this wouldn't give rise to an exponential map, nor, as far as I know, a metric space structure on $L$.

Why is our solar system "tipped" about 63° with respect to the plane of our galaxy?

What you could think at first, regarding the orientation of any planetary system, is that it should be roughly in the plane of the galaxy, simply by angular momentum conservation.



But, when you take a look at observations, you see that protoplanetary disks orientation is not what you would expect, with no preferential orientation (protoplanetary disks are embryo of planetary systems, that makes them interesting). In the following figure, the orientation corresponds to the inclination between the line of sight and the rotation axis of the disk.



disk orientation (own work, CC-by-nc-sa)



Why there is this distribution of orientation?



The angular momentum scenario is nice but far to simple: star formation occurs in gas clouds in the interstellar medium, and these clouds are known to be turbulent (Larson, 1981). Turbulence simply disrupts the gas, and is dominent over the global angular momentum of the cloud. Actually, you can even test that with numerical simulations of star formation: put an initial angular momentum consistent with observations, and some turbulence (subsonic or slightly supersonic, also in accordance to observations), and you will get a misalignement of the rotation axis, due to turbulence.

Wednesday, 2 July 2014

surface - Could an impact have resurfaced Venus 300 million years ago?


Would an impact event leave visible traces like impact basins, or
could the entire surface melt and reform as it is today, as I suppose
Earth did when the Moon formed? Could Venus have been a very different
planet up until 0.3 billion years ago? How could one find out, what
kind of investigation would be needed?




Certainly giant impacts were fairly common, though I don't think they were common any more as recent as 300-500 million years ago. That would have been a kind of unusual event. Mostly they occurred when the solar-system was young.



Mercury probably had one



Venus may have too - and I recommend this article as very loosely related.



The Earth had one - as you know.



and Mars too, but not quite as "giant", though still very big and it left a visible mark that spans nearly 1/2 of Mars' surface. See Here and Here.



The Moon even had one and there is evidence in the thickness of the Moon's crust and location of it's volcanic magma pools/dark spots. See Here and Here.



The point I'm trying to make with all these examples is that unless the Impact is very very large, it would leave a impact footprint of sorts. The impact has to be large enough to not only liquefy the surface but to also allow the surface enough time to reset in mostly uniform layers so you don't get 1/2 of the planet with a thicker crust than the other half. Impacts of that size are statistically improbable, within the last 300 million years. Most of the big impacts happened long before then.



Wikipedia suggests that there's other reasons for believing the global resurfacing though interior heat idea. Venus Global resurfacing event - Wikipedia - Venus lack of a magnetic field, it's influx and loss of water and D to H ratio and it's Sulfur content. Do these have had to happened through internal heat resurfacing? Now, the article says there's no proof, only some pretty good evidence, so it it possible Venus resurfacing was due to an impact? I see no reason why it wouldn't be possible.



Given Venus' Heat trapping atmosphere, higher temperature and greater gravity, which also means, greater impact velocity, at least escape velocity of about 10 km/s where as the Mars impact that covers a hemisphere was thought to be about 5 km/s and the impacting object into Mars, at least 1,600 km in radius which is nearly the size of the Moon. You might not need that big to impact Venus and create a complete resurfacing, but I wouldn't think you could go too much smaller.



I love the Venus had a moon that crashed into it idea cause that seems to work on a few levels. A giant impact that created Venus could have happened 4 and change billion years ago and created a Venusian Moon, which over time, could both slow Venus' rotation and work it's way towards Venus in the process, eventually crashing into it. It seems at least, possible, though I'm not sure what evidence could be gathered to prove it. I like your idea a lot and I think it's much more likely than this one: Earth stole the Moon from Venus.



As to Venus being very different prior to it's resurfacing. The sun was only about 3%-5% less luminous when Venus resurfaced and given Venus' proximity to the sun, I have a very hard time seeing Venus as not already in a run-away greenhouse effect even then. I love the Venus once had oceans and maybe life hypothesis, but with a CO2 rich atmosphere, it's hard for me to see it as likely. My guess is that Venus was hot and virtually without water 300 million years ago too. Unless it somehow stored a huge amount of it's CO2 underground, I have a hard time not seeing it as too hot for oceans 300-500 million years ago. Maybe, but I'm skeptical. It would be very neat to have a picture of what Venus was like prior to it's resurfacing and over the hundreds of millions and billions of years before that.



Hope that wasn't too long, and I welcome corrections.

star gazing - Why do the Pleiades look clearer when viewed indirectly?

What you describe is a technique called averted vision and takes advantage of the arrangement of cones and rods, two types of light sensitive elements within the eye:




The retina contains two types of photoreceptors, rods and cones. The
rods are more numerous, some 120 million, and are more sensitive than
the cones. However, they are not sensitive to color. The 6 to 7
million cones provide the eye's color sensitivity and they are much
more concentrated in the central yellow spot known as the macula. In
the center of that region is the " fovea centralis ", a 0.3 mm
diameter rod-free area with very thin, densely packed cones.




Shorter, the cones are sensitive to color and work best in bright light, while the rods are most effective in low light conditions. And since the rods are concentrated off center of retina, our night vision is best off center. This averted vision technique can be used with a naked eye, as well as with any optical equipment like binoculars or telescopes, although it can be a bit of a bother with eyeglasses and contact lenses. Especially with the latter, using this advert vision technique can cause additional irritation of the cornea.



As per "beating the system" part, with a bit of practice, you could train yourself to view dim objects better with the corner of your eyes at will, and not merely by chance or for only short periods of time. It might seem hard with first few tries, but it really isn't and you soon get used to not looking directly towards the observed object, countering what we tend to do naturally, direct our eyes towards the object of our observations. Practice makes perfect, but there is no long term and unwanted side effects to doing this, and you'll still be able to direct your eyes towards whatever you'll want to see clearer, just as you did before.



There is one other thing to add to techniques to observing the night skies. Namely, our eyes need some time to adapt to darkness, and with our more frequent staring in computer monitors, television and other illuminated sources, our pupils need even longer to dilate and let the maximum of light to fall on the retina. So for best results, rest your eyes and do not expose them to anything except red light for 30 minutes prior to observing. This allows your pupils to expand to their maximum diameter and build up the levels of optical pigments, which are rapidly lost if exposed to bright light.



It is also important to observe with both of your eyes open to avoid fatigue. If you find this difficult, you may try covering the other eye with your hand or an eye patch. It should also be mentioned that both smoking and drinking cause deterioration in your sight. Your blood-sugar levels are also important and a low level of blood sugar causes lowering of sensitivity of eyes.



Lastly, your eyes will tire using this technique, just as they would by focusing for longer periods of times as you normally would. Look away every now and then, preferably not towards too strong sources of light, but still at least slightly brighter than those you previously observed. Also stretch your legs and blink fast a few times. Stretch your back too, and relax your bottom. You might find it funny, but this technique of relaxing and contracting muscles of that part of our physique has been used by trained spies to cheat polygraph tests that also measure pupil dilation. How's that for "beating the system", huh? It is by far the fastest way of relaxing your eyes, at least that I know of.