Thursday, 31 August 2006

real analysis - Why is Lebesgue integration taught using positive and negative parts of functions?

Well, it is not true that the Riemann integral and infinite series avoid the distinction altogether.



If you want to define improper Riemann integrals, there are two ways to proceed. Say you want to define $\int_a^b f(x)\,dx$, where $a, b$ are not necessarily finite and $f$ need not be bounded.



Either you split $f$ into its positive and negative parts, or you define the integral as a limit of truncated functions on a truncated domain, something like



$$\lim_{t \to +\infty} \lim_{s \to a^+} \lim_{r \to b^-} \int_s^r \max\{\min\{f(x), t\}, -t\}\, dx.$$
But then the result, for functions not in $L^1$, depends on the way you choose to go to the limit.



Exactly the same happens for series: for those which are not absolutely convergent, the result depends on the order of summation.
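(To make this concrete, here is the standard example with the alternating harmonic series; both displayed values are classical facts.)

$$1 - \tfrac{1}{2} + \tfrac{1}{3} - \tfrac{1}{4} + \cdots = \ln 2, \qquad 1 + \tfrac{1}{3} - \tfrac{1}{2} + \tfrac{1}{5} + \tfrac{1}{7} - \tfrac{1}{4} + \cdots = \tfrac{3}{2}\ln 2,$$

so merely taking the positive terms twice as fast already changes the value.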



The trick of considering $f_{+}$ and $f_{-}$ allows one to define improper integrals, which may well be infinite, and to declare that the integral of $f$ does not make sense in those cases where the order of the limits matters.



Of course you know all this, but I post it as an answer since it would be too long for a comment, so that you can comment to explain why this reason is not enough for you.

intuition - Does anyone know an intuitive proof of the Birkhoff ergodic theorem?

I don't know whether this helps or whether you've already seen this before, but this made a lot more intuitive sense to me than the combinatorial approach in Halmos's book.



The key point in the proof is to prove the maximal ergodic theorem. This states that if $M_T$ is the maximal operator $M_T f = \sup_{n > 0} \frac{1}{n+1} (f + Tf + \dots + T^n f)$, then $\int_{M_T f > 0} f \geq 0$. Here $T$ is the associated map on functions coming from the measure-preserving transformation.



This is a weak-type inequality, and the fact that one considers the maximal operator isn't terribly surprising given how such operators arise in: a) the proof of the Lebesgue differentiation theorem (namely, via the Hardy-Littlewood maximal operator $Mf(x) = \sup_{t > 0} \frac{1}{2t} \int_{x-t}^{x+t} |f(r)|\, dr$); b) the theory of singular integrals, where one can define a maximal operator in the same way and prove that it is $L^p$-bounded for $1 < p < \infty$ and weak-$L^1$ bounded in suitably nice homogeneous cases (e.g. the Hilbert transform), one consequence being, for instance, that the Hilbert transform can be computed a.e. via the Cauchy principal value of the usual integral; c) I'm pretty sure the boundedness of the maximal operator of the partial sums of Fourier series is used in the proof of the Carleson-Hunt theorem. So using maximal operators (and, in particular, weak bounds on them) to establish convergence is fairly standard. Once the maximal inequality has been established, it isn't usually very hard to get the pointwise convergence result, and the ergodic theorem is no exception.



The maximal ergodic theorem actually generalizes to the case where $T$ is an operator of $L^1$-norm at most 1, and thinking of it in a more general sense might meet the criteria of your question. In particular, let $T$ be as just mentioned, and consider $M_T$ described in the analogous way. Or rather, consider $M_T'f = \sup_{n \geq 0} \sum_{i=0}^n T^i f$. Clearly $M_T'f > 0$ iff $M_T f > 0$. Moreover, $M_T'$ has the crucial property that $T M_T' f + f = M_T' f$ whenever $M_T f > 0$.



Therefore,
$\int_{M_T'f>0} f = \int_{M_T'f>0} M_T'f - \int_{M_T'f>0} TM_T'f.$ The first part is in fact $\|M_T'f\|_1$ because the modified maximal operator is always nonnegative. The second part is at most $\|T M_T'f\| \leq \|M_T'f\|$ by the norm condition. Hence the difference is nonnegative.



Perhaps this will be useful: let $M$ be an operator (not necessarily linear) sending functions to nonnegative functions such that $(I-T)Mf = f$ wherever $Mf>0$, for $T$ an operator of $L^1$-norm at most 1. Then $\int_{Mf>0} f \geq 0$. The proof is the same.

Tuesday, 29 August 2006

ct.category theory - What precisely Is "Categorification"?

The Wikipedia answer is one answer that is commonly used: replace sets with categories, replace functions with functors, and replace identities among functions with natural transformations (or isomorphisms) among functors. One hopes for new, deeper results along the way.



In the case of the work of Lauda and Khovanov, they often start with an algebra (for example ${\bf C}[x]$ with operators $d(x^n) = n x^{n-1}$ and $x \cdot x^n = x^{n+1}$, subject to the relation $d \circ x = x \circ d + 1$) and replace this with a category of projective $R$-modules and functors defined thereupon, in such a way that the associated Grothendieck group is isomorphic to the original algebra.
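(As a quick sanity check of that relation, here is a minimal sympy sketch of my own; the names `d` and `X` are just labels for the two operators.)

    import sympy as sp

    t = sp.symbols('t')
    p = sp.Function('p')(t)          # a generic polynomial / smooth function

    d = lambda q: sp.diff(q, t)      # the operator d
    X = lambda q: t * q              # multiplication by x (written in the variable t)

    # d(x p) - (x d(p) + p) should vanish identically, i.e. d∘x = x∘d + 1
    print(sp.simplify(d(X(p)) - (X(d(p)) + p)))   # -> 0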



Khovanov's categorification of the Jones polynomial can be thought of in a different way, even though, from his point of view, there is a central motivating idea linking this paragraph and the preceding one. The Khovanov homology of a knot constructs, from the set of $2^n$ Kauffman bracket smoothings of the diagram ($n$ is the crossing number), a homology theory whose graded Euler characteristic is the Jones polynomial. In this case, we can think of taking a polynomial formula and replacing it with a formula that inter-relates certain homology groups.



Crane's original motivation was to define a Hopf category (which he did) as a generalization of a Hopf algebra in order to use this to define invariants of $4$-dimensional manifolds. The story gets a little complicated here, but goes roughly like this. Frobenius algebras give invariants of surfaces via TQFTs. More precisely, a TQFT on the $(1+1)$ cobordism category (e.g. three circles connected by a pair of pants) gives a Frobenius algebra. Hopf algebras give invariants of 3-manifolds. What algebraic structure gives rise to a $4$-dimensional manifold invariant, or a $4$-dimensional TQFT? Crane showed that a Hopf category was the underlying structure.



So a goal, from Crane's point of view, would be to construct interesting examples of Hopf categories. Similarly, in my question below, a goal is to give interesting examples of braided monoidal 2-categories with duals.



In the last sense of categorification, we start from a category in which certain equalities hold. For example, a braided monoidal category has a set of axioms that mimic the braid relations. Then we replace those equalities by
$2$-morphisms that are isomorphisms and that satisfy certain coherence conditions. The resulting $2$-category may be structurally similar to another known entity. In this case, $2$-functors (objects to objects, morphisms to morphisms, and $2$-morphisms to $2$-morphisms in which equalities are preserved) can be shown to give invariants.



The most important categorifications in terms of applications to date are (in my own opinion) Khovanov homology, Ozsváth-Szabó's knot invariants, and Crane's original insight. The first two are important since they are giving new and interesting results.

Monday, 28 August 2006

differential equations - Does a (smooth, constant-rank, integrable) distribution have a (local) basis of divergence-free vector fields?

In coordinate-free language, my question is as follows. Let $M$ be an $n$-dimensional manifold with volume form, and let $\mathcal D$ be a smooth (integrable, if necessary) distribution with constant rank $k \leq n$ (by which I mean that $\mathcal D$ is a subbundle of the tangent bundle, spanned by $k$ everywhere-linearly-independent smooth vector fields). Given a point $p \in M$, is there necessarily a neighborhood $U \ni p$ and vector fields $w_1,\dots,w_k$ on $U$ so that each $w_a$ is divergence-free (with respect to the volume form) and so that $\mathcal D|_U = \operatorname{span}\{w_1,\dots,w_k\}$?



Since my question is local, I will ask it again on $\mathbb R^n$ in coordinates. I will let $x^i$, $i=1,\dots,n$, be the standard coordinates on $\mathbb R^n$. Suppose you are given smooth vector fields $v_a(x) = \sum_{i=1}^n v_a^i(x) \frac{\partial}{\partial x^i}$ for $a=1,\dots,k$, where $k \leq n$, and suppose moreover that for each $x$, the set of vectors $\{v_1(x),\dots,v_k(x)\}$ is linearly independent. (Put another way, $v$ is an everywhere-full-rank $(k\times n)$-matrix-valued function.) I'm looking for a smooth ${\rm GL}(k)$-valued function $m(x) = \{m^a_b(x)\}_{a,b=1}^k$ in a neighborhood of $0 \in \mathbb R^n$ so that for each $b=1,\dots,k$,
$$ \sum_{i=1}^n \sum_{a=1}^k \frac{\partial}{\partial x^i}\bigl[ m^a_b(x)\, v^i_a(x) \bigr] = 0 $$



If such an $m$ (or, if you prefer, $\{w_1,\dots,w_k\}$) exists, the natural questions are:



  • Locally, how unique is it? Certainly it can be multiplied by any constant invertible $k\times k$ matrix. Is this it? I.e., if we fix the matrix (spanning set) at $0$ ($p$), does it fix the value in an open neighborhood?

  • Can "local" be replaced by "global"? (I'm not even sure whether a rank-$k$ smooth distribution, which for me is a subbundle that's locally spanned by $k$ everywhere-independent vector fields, can be globally spanned by $k$ everywhere-independent vector fields.)

Finally, I should remark that for my application, my distribution is integrable — does this matter for whether the answer is yes? Also, for my application, not only do I only need the result locally, but I actually am working in the formal neighborhood of a point, so for my purposes it's fine if all smooth functions are replaced by formal power series. Occasionally, but rarely, questions of existence of solutions to differential equations are easy in formal land but hard even in smooth land.



If you're feeling extra smart, the last generalization I'd be interested in knowing the answer to is what happens if the rank of the distribution can drop. I.e., fix $k \leq n$, and suppose that we have vector fields $v_1,\dots,v_k$ which may be linearly dependent at $0$. When is there an (invertible) $(k\times k)$-matrix-valued function $m$ so that the above holds? The answer is a resounding "no" when $k=n=1$, but I have no intuition for higher values.
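(For concreteness, here is, I believe, the simplest instance of that "no": take $k=n=1$ and $v = x\,\frac{\partial}{\partial x}$, which vanishes at $0$. An invertible $1\times 1$ matrix is a nowhere-zero function $m(x)$, and the condition reads
$$\frac{d}{dx}\bigl[m(x)\,x\bigr] = 0,$$
i.e. $m(x)\,x$ is constant; evaluating at $x=0$ forces the constant to be $0$, so $m$ vanishes for $x \neq 0$, contradicting invertibility.)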



My motivation is that integrable distributions are the most general thing (that I know of, anyway), that can act by smooth "symmetries" of a manifold. I would like to know if every smooth symmetry can be taken to be volume-preserving. For a different interpretation of this general question, see e.g. Moser, On the volume elements on a manifold, Trans. Amer. Math. Soc., 1965, vol. 120, pp. 286--294, MR0182927, in which it is proven that any globally-volume-preserving diffeomorphism is smoothly isotopic through globally-volume-preserving diffeomorphisms to a locally-volume-preserving diffeomorphism.

Sunday, 27 August 2006

ho.history overview - A place to find original papers

I would also advise (especially if they were published in French journals, such as the articles by Élie Cartan, Fréchet, and Henri Cartan)



GALLICA



and



NUMDAM



You can often download whole articles, depending on date and copyright.



Here is an obituary for Sophus Lie in the weekly Comptes Rendus of the French Academy of Sciences from 1899 (CRAS 1899, p. 525).



Three important original articles by Sophus Lie; the first two are probably on GDZ.



Sophus Lie, Über Complexe, insbesondere Linien- und Kugel-Complexe, mit Anwendung auf die Theorie partieller Differentialgleichungen; Mathematische Annalen Vol. 5, pp. 145–256 (1872)



Sophus Lie, Untersuchungen über Transformationsgruppen. II; Archiv for Mathematik og Naturvidenskab Vol. 10, pp. 353–413 (Kristiania 1886)



Sophus Lie, unter Mitwirkung von Friedrich Engel, Theorie der Transformationsgruppen III, 1895. Printed as a book, I think; I have it as chapters in the Chelsea reprint.

combinatorial geometry - Can any vertex remain when removing halfspaces from a projectively transformed polytope?

Let P be a simple polytope defined as an intersection of n halfspaces.



A facet F of P, supported by halfspace H, is removable if the intersection of the remaining (n-1) halfspaces is bounded. F is projectively removable if there exists a projective transformation π such that π(F) is removable from π(P).



It is easy to show that every facet of a simple d-polytope with at least (d+2) facets is projectively removable, since there is a projective transformation mapping (d+1) of the remaining halfspaces into a d-simplex.



Consider a vertex v defined by the intersection of d halfspaces and lying on a removable facet F supported by halfspace H. Suppose we translate H along its normal axis, away from the center of the polytope, out towards infinity. As we do so, some vertices will disappear from F as the corresponding halfspaces no longer intersect H within the polytope. Just before H leaves the polytope entirely, there will be exactly d vertices left on F.



Define v to be a final vertex if, when H is translated out of P in this way, v is one of the d vertices remaining on F.




In a given realization of a polytope, some set of d vertices on a removable facet will be final. But if we apply an appropriate projective transformation, can any vertex v be made into a final vertex? In other words,




for any vertex v on a facet F of a simple polytope, is there a projective transformation π_v such that both π_v(F) is removable and π_v(v) is final in π_v(P)?





Based on what I believe I understand about projective transformations, I can imagine that there is a projective transformation that shrinks F to be arbitrarily small, and another transformation that perturbs F so that a given vertex "sticks out" enough to be a final vertex. However, I am not clear how to show, from a formal definition of projective transformations, that such transformations always exist, or that a single transformation can impose both properties.



As a continuation of this question, let me ask: how can I gain more intuition about which properties of a polytope can be modified by a projective transformation? Can facets be scaled arbitrarily and edges shifted around as I have suggested? I have taken a look at some texts on projective geometry suggested in Ziegler, but I'm interested in knowing more precisely what kinds of things projective transformations can and cannot do to polytopes.

Saturday, 26 August 2006

gn.general topology - Cohomology and fundamental classes

This is a reply to Alon's comment, but it's too long to be a comment and is probably interesting enough to be an answer.



Here's an example Thom gives of a homology class that is not realized by a submanifold: let $X=S^7/\mathbb Z_3$, with $\mathbb Z_3$ acting by rotations, and $Y=X \times X$.
Then $H^1(X;\mathbb Z_3)=H^2(X;\mathbb Z_3)=\mathbb Z_3$ (and they are related by a Bockstein); let $u$ generate $H^1$ and $v=\beta u$ be the corresponding generator of $H^2$. Then it can be shown that the class $u \otimes vu^2 - v \otimes u^3 \in H^7(Y;\mathbb Z_3)$ is actually integral (i.e., in $H^7(Y;\mathbb Z)$), and its Poincaré dual in $H_7$ cannot be realized by a submanifold (in fact, it can't be realized by any map from a closed manifold to $Y$, which need not be the inclusion of a submanifold). This is a natural example to consider because the first obstruction to classes being realized by submanifolds comes from a mod 3 Steenrod operation, and these are easy to compute on $Y$ because $X$ is the 7-skeleton of a $K(\mathbb Z_3,1)$. Note that the class in question is 3-torsion, so trivially 3 times it is realized by a submanifold.

computability theory - A subset of all languages which is uncountable?

One way to take the question is to ask for a set of languages, for instance a complexity class, which is uncountable "for a good reason". In other words, one that people study for some substantially different reason, and for which it's clearly convenient to be uncountable.



Probably the most common example is the class P/poly. This can be defined as polynomial-time computations with polynomial-length advice strings. (An advice string is any extra information that depends on the length of the input, but not on the specific value of the input.) By a famous structure theorem, it is also the class of computations performed by polynomial-sized circuits on n input bits, without the requirement that the circuits can be built quickly by a Turing machine. This is clearly not a countable set of languages, because anything recursive or non-recursive can be done with the input length.
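(To illustrate what an advice string buys you, here is a toy Python sketch of my own, not a formal definition: the decider runs in polynomial time on the input but consults one advice bit chosen per input length; since the advice table is arbitrary, it could encode a non-recursive set of lengths, which is exactly the source of uncountability.)

    def make_decider(advice):
        """advice: dict mapping input length -> one advice bit."""
        def decide(x: str) -> bool:
            # Any polynomial-time work on x itself could go here; in this toy
            # language, membership happens to depend only on len(x).
            return bool(advice.get(len(x), 0))
        return decide

    in_L = make_decider({0: 1, 3: 1, 5: 1})   # hypothetical advice table
    print(in_L("abc"), in_L("abcd"))          # True False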



On the other hand, it is a very useful class and construction. A stronger version of the P vs NP problem, inspired by circuit formulations of P vs NP, conjectures that NP is not contained in P/poly. And it is a theorem that BPP (randomized polynomial time) is contained in P/poly.

Thursday, 24 August 2006

dg.differential geometry - Complex Laplacians in spectral decompositions of graphs

The Laplacian of a graph is a useful tool in many kinds of graph decomposition problems. Since inspection of the eigenvector corresponding to the second smallest eigenvalue (the Fiedler vector) yields information about sparse cuts, it's often used to partition graphs or even cluster data. Of course the Laplacian on a graph can be viewed as a discrete analog of the Laplace-Beltrami operator on a general Riemannian manifold, and indeed much of the intuition for graph partitioning methods comes from this connection.
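(For readers who want to see the Fiedler-vector heuristic in action, here is a small numpy sketch of my own; the graph is an arbitrary toy example, two triangles joined by a single edge, and the sign pattern of the Fiedler vector recovers the two triangles.)

    import numpy as np

    # Two triangles {0,1,2} and {3,4,5} joined by the single edge (2,3).
    edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
    n = 6
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0

    L = np.diag(A.sum(axis=1)) - A      # combinatorial graph Laplacian L = D - A
    vals, vecs = np.linalg.eigh(L)      # eigenvalues in ascending order
    fiedler = vecs[:, 1]                # eigenvector of the second smallest eigenvalue

    print(np.round(vals[1], 4))         # algebraic connectivity
    print(fiedler > 0)                  # sign pattern separates the two triangles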



I recently heard a talk (based on this paper) (and found this paper) on the use of a Hermitian analog of the Laplacian to do graph partitioning. While the authors demonstrate the value of the method experimentally for various graph partitioning problems, I was more interested in the underlying theory.



Is there any way to relate Hermitian structures on graphs to equivalent concepts on a manifold? Maybe Hermitian manifolds, or Kähler manifolds? Specifically, do the notions of sparse cuts (on graphs) and diffusion via the heat equation (on Riemannian manifolds) have parallels in the complex case?

oa.operator algebras - A non-commutative Radon-Nikodym derivative.

Such a t_0 is unique if its support is at most p, where p is the support of ϕ.
Note that we can replace t_0 by pt_0p, and the support of pt_0p is at most p.



Without this additional condition t_0 is highly non-unique, because
we can replace t_0 by t_0 + q,
where q is an arbitrary self-adjoint element with support at most 1-p such that t_0 + q ≥ 0.
Using simple algebraic manipulations one can show that all solutions can be obtained in this way.



See Lemma 15.4 (page 104) in
Takesaki's book “Tomita's theory of modular Hilbert algebras and its applications”.
Electronic version: http://gen.lib.rus.ec/get?md5=ACC2A399A5C65C5CB2CCEE7CBEB3FAC3
[Note that Takesaki implicitly assumes that φ_0 is faithful, hence you need to introduce
an additional condition on the support of h.]




One would naively expect that any two such operators t_0 and t_1 would satisfy ϕ((t_1−t_0)^2)=0.




This is a trivial corollary of the above statement characterizing all possible solutions.

Monoidal structures on von Neumann algebras

The category of von Neumann algebras W* admits a variety of monoidal structures of three distinct flavors.



(1) W* is complete and therefore you have a monoidal structure given by the categorical product.



(2a) W* is cocomplete and therefore you also have a monoidal structure given by the categorical coproduct.



(2b) I suspect that there is also a “spatial coproduct”, just as there is a categorical
tensor product and a spatial tensor product (see below).
The spatial coproduct should correspond to a certain central projection in the categorical coproduct.
Perhaps the spatial coproduct is some sort of coordinate-free version
of the free product mentioned in Dmitri Nikshych's answer.



(3a) For any two von Neumann algebras M and N
consider the functor F from W* to Set that sends a von Neumann algebra L
to the set of all pairs of morphisms M→L and N→L with commuting images.
The functor F preserves limits and satisfies the solution set condition, therefore it is representable.
The representing object is the categorical tensor product of M and N.



(3b) There is also the classical spatial tensor product.
I don't know any good universal property that characterizes it except
that there is a canonical map from (3a) to (3b) and its kernel corresponds
to some central projection in (3a). Perhaps there is a nice description of this central projection.



Since your monoidal structure is of the third flavor and you don't want a monoidal
structure of the first flavor, I suggest that you try a monoidal structure of the second flavor.
I suspect that the spatial coproduct of two factors is actually a factor.
You are lucky to work with factors, because in the commutative case 2=3, in particular 2a=3a and 2b=3b.

Does category theory help understanding abstract algebra?

I can recommend several much better sources that will ease your transition into both abstract algebra and category theory, Silva.



Lang is far too difficult for a first brush with abstract algebra, and MacLane is even MORE difficult for a neophyte in algebra. Category theory has VERY far-reaching conceptual implications for most of modern mathematics, not just algebra. So no, in principle you don't have to learn abstract algebra to learn it, but that's where most mathematicians have infused it. This is because it's natural to organize types of structures into categories, and that's really what algebra is all about: types of structures, i.e. sets with binary operations on them.



There are a legion of great abstract algebra texts, but my favorite is E. B. Vinberg's A COURSE IN ALGEBRA, available through the AMS. It takes a very concrete, geometric approach and builds an extraordinary amount of algebra from first principles, all the way up to the elements of commutative algebra, Lie algebras and groups, and multilinear algebra. It will help you learn a great deal of algebra very quickly and without the confusion of learning category theory simultaneously. Another geometrically flavored, but a bit more challenging, book is the classic ALGEBRA by Michael Artin. Indeed, I think the two books complement each other very nicely. Mastering both books will give you a very good working knowledge of algebra, and you'll be more than ready to tackle Lang's book after that.



As for category theory, the best introductory text I know is CATEGORY THEORY by Steven Awodey. Gentle, rigorous and masterly, it's the best book for undergraduates and the only one I'd use for a beginning course in category theory for students who don't have strong backgrounds in algebra. It's pricey, but totally worth it. One other very good, and short, book you should look for, and which I heartily recommend, is T. S. Blyth's CATEGORIES: a terrific short introduction for any student with a good mathematics background who wants just the basics of category theory. It's REALLY hard to find now, but if you can get a copy, by all means do so.



That should help you out. Good luck!

Wednesday, 23 August 2006

nt.number theory - Wanted: A constructive version of a theorem of Furstenberg and Weiss

Dear RJS,



I think Tim Gowers is right - the problem seems too hard. Reasonably good bounds are known on (for example) the least $n \geq 1$ for which $\Vert n^2 \sqrt{2} \Vert \leq \epsilon$; one can find such an $n$ with $n \leq \epsilon^{-7/4 + o(1)}$. This is a result of Zaharescu [Zaharescu, A.; Small values of $n^2\alpha \pmod 1$. Invent. Math. 121 (1995), no. 2, 379--388.] Zaharescu in fact obtains this result for any $\theta$ in place of $\sqrt{2}$. From a cursory glance at the paper I see that he uses the continued fraction expansion of $\theta$, and so it may be that one can slightly improve his bound in the particular case of $\sqrt{2}$.



It is an old conjecture of Heilbronn that the right bound here should be $\epsilon^{-1 + o(1)}$. I don't know off the top of my head whether any more precise conjectures have been made based on sensible heuristics, either for this or for your original problem.



To get an explicit upper bound for your problem one can proceed quite straightforwardly using arguments due to Weyl. I don't think this is the right place to describe an argument in detail: there are several variants, and I first learnt this from a Tim Gowers course at Cambridge. See Theorem 3.10 of these notes:



http://www.math.cmu.edu/~af1p/Teaching/AdditiveCombinatorics/notes-acnt.pdf



If you really had to show that 627 is the answer to your specific problem, probably the best bet would be to inspect all the quadratics $n^2\sqrt{2} + \theta n + \theta'$ for $\theta,\theta'$ in some rather dense finite subset of $[0,1]^2$ and show using a computer that each takes (mod 1) a value less than 0.009999 for some $n \leq 627$. Painful!
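(For what it's worth, here is a minimal Python sketch of that brute-force check; the grid spacing is an arbitrary choice of mine, while the threshold and the bound 627 come from the question as described above. A real verification would need a much finer grid plus an argument controlling behaviour between grid points.)

    import math

    SQRT2 = math.sqrt(2)

    def frac(x):
        return x - math.floor(x)

    def first_good_n(theta, theta_p, bound=627, eps=0.009999):
        """Least n <= bound with {n^2*sqrt(2) + theta*n + theta_p} < eps, else None."""
        for n in range(1, bound + 1):
            if frac(n * n * SQRT2 + theta * n + theta_p) < eps:
                return n
        return None

    grid = [i / 100 for i in range(100)]     # coarse grid on [0,1)^2
    worst = max((first_good_n(t, tp) or float("inf")) for t in grid for tp in grid)
    print(worst)   # finite (<= 627) means the check passed on this grid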



The argument of Furstenberg and Weiss uses ergodic theory and so will not directly lead to an effective bound.



There are quite detailed conjectures about the fractional parts of $n^2\sqrt{2}$ (and other similar sequences) due to Rudnick, Sarnak and Zaharescu, essentially encoding the fact that this sequence of fractional parts is expected to behave like a Poisson process. I don't think those conjectures are likely to be helpful in your context since, taken too literally, they would seem to suggest that there are arbitrarily long intervals without a number such that $\Vert n^2 \sqrt{2} \Vert < 0.01$ - contrary to Furstenberg-Weiss.



Nonetheless let me point out a recent paper of Heath-Brown which is very interesting in connection with these matters:



http://arxiv.org/pdf/0904.0714v1.



One more point perhaps worthy of mention: sequences with bounded gaps are usually known as syndetic.

Tuesday, 22 August 2006

journals - How do I fix someone's published error?

I think the question is a bit too detailed. A short version is this: what do people do when they discover an error in other people's papers? Obviously, as the question explains, there is no universal rule - this all depends on the type of error, the relative importance of the results in the paper, the relationship between the person who made the error (let's call her/him X) and the person who discovered it (Y), etc. Let me simply list some relatively standard options.



1) Y tells X about the error. X finds a way to fix it, publishes an "erratum" in the journal, on the arXiv and/or on his/her own webpage, and gives profuse thanks to Y (but only if Y gives permission to do so). Occasionally this is a joint (X,Y) paper. Either way, this is the most desirable outcome.



1)' Even if the result is false in full generality, X should still publish an "erratum" saying "such-and-such weaker version survives", or even "every hope to prove such-and-such is lost forever"...



2) Y wants to remain anonymous, or X can't be bothered. Then Y writes a letter to the editor-in-chief of the journal which published X's paper. It is their responsibility as much as X's. Let the editor(s) deal with the mess. This is the easiest way out (for Y).



2)' A slightly better way to remain anonymous is for Y (by agreement with the editors) to publish a short erratum under an assumed name. I have seen this happen, but in the long run this does not work - eventually people find out who the author was (and in a couple of cases I know, MathSciNet rather counterproductively links the pen name to Y). On the other hand, if you really want to remain anonymous, e.g. use an assumed name linked to a fake email account, your erratum submission will not be taken seriously (journals get quite a few crackpot submissions).



3) Y is heavily involved in the field and is writing an article/book (B) on the subject. Y doesn't know how to fix the error. Then sometimes it is a good idea to include this piece of math in the final remarks or an appendix. Y might want to be nice and inform X first, before making the error public. This is a good solid option which allows others to say "error in A was pointed out in B".



4) The error is fundamental, kills paper A, but Y knows how to fix it. Y should publish a new paper explaining the error in full, right in the introduction or the first section. Y should write the paper in such a way as if assuming that X will be refereeing the paper... On rare occasions, it can happen that later Z publishes a paper acknowledging an error in Y's paper, and claiming to have "finally" found "a definite proof", etc. Sometimes an unavoidable chaos ensues, but the good faith decision by Y to publish was still a good one.



5) Y can prove (by different means) a result which follows easily from (or is even a special case of) the one by X. Y should still write a paper. Lots of delicacy is required in trying to explain the whole story. This is the hardest thing to do. Consult a senior expert before making the paper available.



6) In extreme cases, Y can just post a note on the arXiv (this happens occasionally, see the meta discussion), but let me strongly discourage this practice. It should only be used when no other recourse is available. When this kind of thing happens, the allegedly erroneous paper A is refereed, but the erratum is not, so outsiders no longer know what to think. This can undermine the credibility of the field and turn people away from working on the problem.



UPDATE: After reading other answers, I realized that I am answering a slightly different question. This is only meant to catalog the possibilities, not to endorse them or explain "how to get there". The latter is often really delicate and difficult, so don't try it if you are not sure! Although some of these outcomes are preferable to others, this too is decided case by case. Finally, the order is somewhat arbitrary.

graph theory - Infinite Ramsey theorem with infinitely many colours

Clearly, it is possible to colour the edges of an infinite complete graph so that it does not contain any infinite monochromatic complete subgraph. Now what about the following?




Let $G$ be the complete graph with vertex set the positive integers. Each edge of $G$ is then coloured $c$ with probability $\frac{1}{2^c}$, for $c = 1, 2, \dots$ What is the probability that $G$ contains an infinite monochromatic complete subgraph?




It is unclear to me whether the answer should be $0$, $1$, or something in between.

Saturday, 19 August 2006

cohomology of moduli spaces

Let me tell you what I know about the cohomology of congruence subgroups of Sp_{2g}(Z). As far as cohomology with rational coefficients goes, this was determined by Borel. In the limit as g->infty, it is isomorphic to a polynomial algebra with generators in degrees 4k+2. See his paper



A. Borel, Stable real cohomology of arithmetic groups, Ann. Sci. École Norm. Sup. (4) 7 (1974), 235–272 (1975).



I don't know of many integral calculations. I calculated H1 of the level L congruence subgroups for L odd and g at least 3 in my paper "The abelianization of the level L mapping class group", which is available on my webpage (click my name for a link). This was also determined independently by Perron (unpublished) and M. Sato. Sato's paper is "The abelianization of the level 2 mapping class group", and is available on the arXiv. He also works out H_1 for L even.



Another paper with information on H^2 is my paper "The Picard group of the moduli space of curves with level structures", which is also available on my webpage.



As a remark, both of my papers mentioned above are really papers about the mapping class group and the moduli space of curves, but I ended up proving results about PPAVs and Sp_{2g}(Z) along the way.

ct.category theory - Is there a notion of "good" distributor/profunctor for model categories?

There's a natural generalization of Quillen adjunction.



Let $A$ and $B$ be model categories, and $M:A^{op}\times B\to \mathrm{Set}$ be a distributor/profunctor/module. This should be Quillen if whenever $i:a\to a'$ is a cofibration in $A$ and $p:b\to b'$ is a fibration in $B$, with either $i$ or $p$ a weak equivalence, then the induced map from $M(a',b)$ to the pullback $M(a,b)\times_{M(a,b')}M(a',b')$ is surjective.

rt.representation theory - Isomorphism classes of nilpotent Lie algebras

I have found a counterexample while studying Lie algebra degenerations. Consider the following filiform nilpotent Lie algebra $\mathfrak{g}$ of dimension $13$, given by the brackets with respect to a basis
$(e_1,\ldots,e_{13})$:
\begin{align*}
[e_1,e_i] & = e_{i+1},\quad 2\le i\le 12 \\[0.2cm]
[e_2,e_3] & = e_5 \\
[e_2,e_4] & = e_6 \\
[e_2,e_5] & = \frac{9}{10}e_7-e_9 \\
[e_2,e_6] & = \frac{4}{5}e_8-2e_{10} \\
[e_2,e_7] & = \frac{5}{7}e_9-\frac{335}{126}e_{11}+ \frac{22105}{15246}e_{13}\\
[e_2,e_8] & = \frac{9}{14}e_{10}-\frac{125}{42}e_{12}\\
[e_2,e_9] & = \frac{9}{12}e_{11}-\frac{4421}{1452}e_{13}\\
[e_2,e_{10}] & = \frac{8}{15}e_{12}\\
[e_2,e_{11}] & = \frac{27}{55}e_{13}\\[0.5cm]
[e_3,e_4] & = \frac{1}{10}e_{7}+e_{9}\\
[e_3,e_5] & = \frac{1}{10}e_{8}+e_{10}\\
[e_3,e_6] & = \frac{3}{35}e_9+\frac{83}{126}e_{11}-\frac{22105}{15246}e_{13}\\
[e_3,e_7] & = \frac{1}{14}e_{10}+\frac{20}{63}e_{12}\\
[e_3,e_8] & = \frac{5}{84}e_{11}+\frac{697}{10164}e_{13}\\
[e_3,e_{9}] & = \frac{1}{20}e_{12}\\
[e_3,e_{10}] & = \frac{7}{165}e_{13}\\
[e_4,e_5] & = \frac{1}{70}e_9+\frac{43}{126}e_{11}+\frac{22105}{15246}e_{13}\\
[e_4,e_6] & = \frac{1}{70}e_{10}+\frac{43}{126}e_{12}\\
[e_4,e_7] & = \frac{1}{84}e_{11}+\frac{7589}{30492}e_{13}\\
[e_4,e_8] & = \frac{1}{105}e_{12}\\
[e_4,e_9] & = \frac{1}{132}e_{13}\\[0.5cm]
[e_5,e_6] & = \frac{1}{420}e_{11}+\frac{313}{3388}e_{13}\\
[e_5,e_7] & = \frac{1}{420}e_{12}\\
[e_5,e_8] & = \frac{3}{1540}e_{13}\\[0.5cm]
[e_6,e_7] & = \frac{1}{2310}e_{13}
\end{align*}
Let $\mathfrak{h}$ be the ideal generated by $e_3,\ldots, e_{13}$, of codimension $2$. Then $\mathfrak{g}/\mathfrak{h}=\{ [e_1],[e_2]\}$, and we
may write $L=\{ \alpha[e_1]+\beta[e_2]\}$, where $\alpha,\beta$ vary over all complex numbers. Hence the preimage is $L(\alpha,\beta)=\{\alpha e_1+\beta e_2,e_3,\ldots,e_{13}\}$, which is a $1$-codimensional ideal in $\mathfrak{g}$. Now it is easy to see that
$$
L(1,\alpha)\simeq L(1,\alpha') \text{ if and only if } \alpha=\pm \alpha'.
$$
So we obtain infinitely many non-isomorphic Lie algebras.

Friday, 18 August 2006

soft question - Dimension Leaps

Here is a closely related pair of examples from operator theory, von Neumann's inequality and the theory of unitary dilations of contractions on Hilbert space, where things work for 1 or 2 variables but not for 3 or more.



In one variable, von Neumann's inequality says that if $T$ is an operator on a (complex) Hilbert space $H$ with $\|T\|\leq 1$ and $p$ is in $\mathbb{C}[z]$, then $\|p(T)\|\leq\sup\{|p(z)|:|z|=1\}$. Szőkefalvi-Nagy's dilation theorem says that (with the same assumptions on $T$) there is a unitary operator $U$ on a Hilbert space $K$ containing $H$ such that if $P:K\to H$ denotes orthogonal projection of $K$ onto $H$, then $T^n=PU^n|_H$ for each positive integer $n$.
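(As a purely numerical illustration, and my own addition, here is a small numpy sketch checking the one-variable inequality for a random contraction and a random polynomial; the supremum over the circle is approximated by sampling.)

    import numpy as np

    rng = np.random.default_rng(1)

    # A random contraction: scale a random complex matrix to spectral norm <= 1.
    A = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
    T = A / (np.linalg.norm(A, 2) + 1e-12)

    coeffs = rng.standard_normal(5)        # p(z) = c0 + c1*z + ... + c4*z^4

    def p_of_matrix(M, c):
        out = np.zeros_like(M)
        P = np.eye(M.shape[0], dtype=M.dtype)
        for ck in c:
            out = out + ck * P
            P = P @ M
        return out

    lhs = np.linalg.norm(p_of_matrix(T, coeffs), 2)
    zs = np.exp(2j * np.pi * np.arange(2000) / 2000)   # sample the unit circle
    rhs = np.max(np.abs(np.polyval(coeffs[::-1], zs)))
    print(lhs <= rhs + 1e-8, float(lhs), float(rhs))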



These results extend to two commuting variables, as Ando proved in 1963. If $T_1$ and $T_2$ are commuting contractions on $H$, Ando's theorem says that there are commuting unitary operators $U_1$ and $U_2$ on a Hilbert space $K$ containing $H$ such that if $P:K\to H$ denotes orthogonal projection of $K$ onto $H$, then $T_1^{n_1}T_2^{n_2}=PU_1^{n_1}U_2^{n_2}|_H$ for each pair of nonnegative integers $n_1$ and $n_2$. This extension of Sz.-Nagy's theorem has the extension of von Neumann's inequality as a corollary: If $T_1$ and $T_2$ are commuting contractions on a Hilbert space and $p$ is in $\mathbb{C}[z_1,z_2]$, then $\|p(T_1,T_2)\|\leq\sup\{|p(z_1,z_2)|:|z_1|=|z_2|=1\}$.



Things aren't so nice in 3 (or more) variables. Parrott showed in 1970 that 3 or more commuting contractions need not have commuting unitary dilations. Even worse, the analogues of von Neumann's inequality don't hold for $n$-tuples of commuting contractions when $n\geq 3$. Some have considered the problem of quantifying how badly the inequalities can fail. Let $K_n$ denote the infimum of the set of those positive constants $K$ such that if $T_1,\ldots,T_n$ are commuting contractions and $p$ is in $\mathbb{C}[z_1,\ldots,z_n]$, then $\|p(T_1,\ldots,T_n)\|\leq K\cdot\sup\{|p(z_1,\ldots,z_n)|:|z_1|=\cdots=|z_n|=1\}$. So von Neumann's inequality says that $K_1=1$, and Ando's theorem yields $K_2=1$. It is known in general that $K_n\geq\frac{\sqrt{n}}{11}$. When $n>2$, it is not known whether $K_n<\infty$.



See Paulsen's book (2002) for more. On page 69 he writes:




The fact that von Neumann’s inequality holds for two commuting contractions
but not three or more is still the source of many surprising results and
intriguing questions. Many deep results about analytic functions come
from this dichotomy. For example, Agler [used] Ando’s theorem to deduce an
analogue of the classical Nevanlinna–Pick interpolation formula
for analytic functions on the bidisk. Because of the failure of a von
Neumann inequality for three or more commuting contractions, the analogous
formula for the tridisk is known to be false, and the problem of finding the
correct analogue of the Nevanlinna–Pick formula for polydisks
in three or more variables remains open.


mapping class groups - Comparing geometric intersection numbers.

Let $a$, $b$, and $c$ be simple closed curves in an orientable surface $S$ such that
$i(a,b) \geq 2$, $i(a,c) \geq 1$, and $i(b,c) = 0$.
Let $w$ be a nontrivial element of the free group $\langle T_a,T_b \rangle$ which is different from $T_a^p$ for $p$ a nonzero integer, and set $x = w(a)$.
If $i(a,x) > i(b,x)$ and $n$ is a nonzero integer, is there a way to compare $i(a,T_c^n(x))$ and
$i(b,T_c^n(x)) = i(b,x)$?

gn.general topology - Convexity Theorem of Hamiltonian actions - the connectedness part

Suppose we have a Hamiltonian action of a torus T=T^m=R^m/Z^m on a compact, connected symplectic manifold M. According to the convexity theorem, we know every fiber of the momentum map mu: M--->R^m is connected. My question here is about the proof.



We assume the Hamiltonian action is effective without loss of generality, i.e., only the zero element of T fixes all of M. I already know that the set of regular values of mu is dense in mu(M), and also that the set of points eta in mu(M) with (eta_1, ..., eta_(m-1)) a regular value of the reduced momentum map (mu_1, ..., mu_(m-1)) is dense in mu(M). I also know that the fiber over eta is connected whenever (eta_1, ..., eta_(m-1)) is a regular value of the reduced momentum map. "Since the set of such points is dense in mu(M) it follows by continuity that the fiber of eta is connected for every regular value eta" (in the book by McDuff and Salamon), and this is my first question. My second question is: how does it then follow that every fiber of mu is connected?



Notice that here the Hamiltonian action must play a special role, as the following is not true:



Suppose f: M ---> N is a smooth map between two smooth manifolds, with M compact and connected, and suppose there is a dense subset of f(M) over which each fiber is connected; then each fiber of f is connected.
Example: consider a natural smooth surjection from S^1 to the figure eight. The fiber over the node of the figure eight has two points, but every other fiber is a single point.



If anyone knows how to prove the connectedness part of the Convexity Theorem, could you please show us? Thank you very much!

dg.differential geometry - What is the symbol of a differential operator?

I find Wikipedia's discussion of symbols of differential operators a bit impenetrable, and Google doesn't seem to turn up useful links, so I'm hoping someone can point me to a more pedantic discussion.



Background



I think I understand the basic idea on $\mathbb{R}^n$, so for readers who know as little as I do, I will provide some ideas. Any differential operator on $\mathbb{R}^n$ is (uniquely) of the form $\sum p_{i_1,\dotsc,i_k}(x)\frac{\partial^k}{\partial x_{i_1}\dots\partial x_{i_k}}$, where $x_1,\dotsc,x_n$ are the canonical coordinate functions on $\mathbb{R}^n$, the $p_{i_1,\dotsc,i_k}(x)$ are smooth functions, and the sum ranges over (finitely many) possible indices (of varying length). Then the symbol of such an operator is $\sum p_{i_1,\dotsc,i_k}(x)\,\xi^{i_1}\cdots\xi^{i_k}$, where $\xi^1,\dotsc,\xi^n$ are new variables; the symbol is a polynomial in the variables $\{\xi^1,\dotsc,\xi^n\}$ with coefficients in the algebra of smooth functions on $\mathbb{R}^n$.



Ok, great. So symbols are well-defined for $\mathbb{R}^n$. But most spaces are not $\mathbb{R}^n$ — most spaces are formed by gluing together copies of (open sets in) $\mathbb{R}^n$ along smooth maps. So what happens to symbols under changes of coordinates? An affine change of coordinates is a map $y_j(x)=a_j+\sum_i Y_j^i x_i$ for some vector $(a_1,\dotsc,a_n)$ and some invertible matrix $Y$. It's straightforward to describe how the differential operators change under such a transformation, and thus how their symbols transform. In fact, you can forget about the fact that indices range over $1,\dotsc,n$, and think of them as keeping track of tensor contraction; then everything transforms as tensors under affine coordinate changes, e.g. the variables $\xi^i$ transform as coordinates on the cotangent bundle.



On the other hand, consider the operator $D = \frac{\partial^2}{\partial x^2}$ on $\mathbb{R}$, with symbol $\xi^2$; and consider the change of coordinates $y = f(x)$. By the chain rule, the operator $D$ transforms to $(f'(y))^2\frac{\partial^2}{\partial y^2} + f''(y) \frac{\partial}{\partial y}$, with symbol $(f'(y))^2\psi^2 + f''(y)\psi$. In particular, the symbol did not transform as a function on the cotangent space. Which is to say that I don't actually understand where the symbol of a differential operator lives in a coordinate-free way.
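(Here is a small sympy check of that chain-rule computation, with an arbitrary concrete choice of coordinate change and test function; it only illustrates the transformation formula above.)

    import sympy as sp

    x, y = sp.symbols('x y')
    f = x**3 + x            # an arbitrary coordinate change y = f(x) (my choice)
    u = sp.sin(y)           # an arbitrary test function of the new coordinate

    lhs = sp.diff(u.subs(y, f), x, 2)   # D = d^2/dx^2 applied in the x-coordinate
    rhs = (sp.diff(f, x)**2 * sp.diff(u, y, 2)
           + sp.diff(f, x, 2) * sp.diff(u, y)).subs(y, f)

    print(sp.simplify(lhs - rhs))       # -> 0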



Why I care



One reason I care is because I'm interested in quantum mechanics. If the symbol of a differential operator on a space $X$ were canonically a function on the cotangent space $T^\ast X$, then the inverse of this Symbol map would determine a "quantization" of the functions on $T^\ast X$, corresponding to the QP quantization of $\mathbb{R}^n$.



But the main reason I was thinking about this is from Lie algebras. I'd like to understand the following proof of the PBW theorem:




Let $L$ be a Lie algebra over $\mathbb{R}$ or $\mathbb{C}$, $G$ a group integrating the Lie algebra, $\mathrm{U}L$ the universal enveloping algebra of $L$, and $\mathrm{S}L$ the symmetric algebra of the vector space $L$. Then $\mathrm{U}L$ is naturally the space of left-invariant differential operators on $G$, and $\mathrm{S}L$ is naturally the space of symbols of left-invariant differential operators on $G$. Thus the map Symbol defines a canonical vector-space (and in fact coalgebra) isomorphism $\mathrm{U}L\to\mathrm{S}L$.


Thursday, 17 August 2006

dg.differential geometry - How to show that the Tangent Bundle of a vector space is a Vector Bundle

Hello,



I have the following question about the tangent bundle $T_M = \bigcup_{p \in M} \{p\} \times T_p M$ defined on a manifold $M$ of class $C^r$ modeled on a normed space $X$. My problem is showing that the tangent bundle also forms a vector bundle. I found the following definition of a vector bundle:



A vector bundle is a tuple $(E, B, \pi, F, \mathcal{T})$ where $E, B$ are topological spaces, $\pi : E \rightarrow B$ a continuous surjection, $F$ a normed metric space, and $\mathcal{T}$ is a family $\{U_i, \varphi_i\}_{i \in I}$ of homeomorphisms $\varphi_i : U_i \times F \rightarrow \pi^{-1}(U_i)$ with $B = \bigcup_{i \in I} U_i$ such that



  • $\forall b \in B$, $\pi^{-1}(\{b\})$ has the structure of a normed vector space


  • $\forall i \in I$ we have $\forall x \in U_i$ and $\forall v \in F$ that $\pi(\varphi_i(x, v)) = x$


  • $\forall i \in I, x \in U_i$ the map $\varphi_i^{(x)} : F \rightarrow \pi^{-1}(\{x\})$ defined by $\varphi_i^{(x)}(v) = \varphi_i(x, v)$ is a linear function between the vector spaces $F$ and $\pi^{-1}(\{x\})$


We call



  • $E$ the total space of the vector bundle


  • $B$ the base space of the vector bundle


  • $\pi$ is the projection map of the bundle


  • $\mathcal{T}$ is called a trivialization and $(U_i, \varphi_i)$ is called a trivializing neighborhood.


Now for the tangent bundle it is easy to see that $T_M$ is the total space, $\pi : T_M \rightarrow M : (x, v) \mapsto x$ is the projection, and $M$ is the base space, and I think we can equate $F$ with $X$; but how does one go about finding a trivialization? I thought first about using the induced atlas on $T_M$ (which makes the tangent bundle a differentiable manifold of class $C^{r-1}$ modelled on $X \times X$), but its mappings do not have the correct format.



My problem with using the induced atlas as a trivialization is that it is of the form $\{U_i, \varphi_i\}_{i \in I}$ with $\varphi_i : \pi^{-1}(U_i) \rightarrow \varphi_i(U_i) \times X$, and using $\varphi_i^{-1} : \varphi_i(U_i) \times X \rightarrow \pi^{-1}(U_i)$ I'm almost there, but I have still not found a homeomorphism of the form $U_i \times X \rightarrow \pi^{-1}(U_i)$. The book I'm reading talks about the tangent bundle and says it is a vector bundle, but it does not define a vector bundle at all, so I looked up the definition of a vector bundle and failed to apply it.



Maybe I'm missing something in the definition of a vector bundle (most examples I found on the internet are about finite-dimensional spaces).



Can anybody help me?



Thanks a lot in advance



Marc Mertens

Wednesday, 16 August 2006

reference request - Low dimensional nilpotent Lie algebras

Classification of nilpotent Lie algebras in characteristic 0 is an old problem,
with a lot of literature. For the dimensions up to 6 there is a finite list.
Among the many relevant papers on MathSciNet, I'll list just a few:



MR2372566 (2009a:17027) 17B50 (17B20 17B30)
Strade, H. (D-HAMBMI)
Lie algebras of small dimension.
Lie algebras, vertex operator algebras and their applications, 233–265, Contemp. Math., 442,
Amer. Math. Soc., Providence, RI, 2007.



MR0498734 (58 #16802) 17B30
Skjelbred, Tor; Sund, Terje
Sur la classification des algèbres de Lie nilpotentes. (French. English summary)
C. R. Acad. Sci. Paris Sér. A-B 286 (1978), no. 5, A241–A242.



MR855573 (87k:17012) 17B30
Magnin, L. (F-DJON-P)
Sur les algèbres de Lie nilpotentes de dimension 7. (French. English summary) [Nilpotent Lie algebras of dimension 7]
J. Geom. Phys. 3 (1986), no. 1, 119–144.



MR1737529 (2001i:17010) 17B30 (17B05)
Tsagas, Gr. (GR-THESS-DMP)
Classification of nilpotent Lie algebras of dimension eight.
J. Inst. Math. Comput. Sci. Math. Ser. 12 (1999), no. 3, 179–183.



EDIT: This is a somewhat random sample (I'm not a specialist), but these papers recall results
for low dimensions and have many references to older literature. The reviews in Math
Reviews (MathSciNet) are helpful to look at, if you have access.
There is also a fairly
modern book, which is very high-priced and probably difficult to access:



MR1383588 (97e:17017)
Goze, Michel(F-HALS); Khakimdjanov, Yusupdjan(UZ-AOS)
Nilpotent Lie algebras.
Mathematics and its Applications, 361. Kluwer Academic Publishers Group, Dordrecht, 1996. xvi+336 pp. ISBN: 0-7923-3932-0
17B30 (17-02 17B40 17B56)

ca.analysis and odes - Euclidean volume of the unit ball of matrices under the matrix norm

I worked out the answer for the 2 by 2 case as well.



First, when dealing with 2 by 2 matrices in general, a convenient variable change is:



$a\to(w+x)/\sqrt{2},\quad d\to(w-x)/\sqrt{2},\quad c\to(y-z)/\sqrt{2},\quad b\to(y+z)/\sqrt{2}.$


Then $a^2+b^2+c^2+d^2 = w^2+x^2+y^2+z^2$, and the determinant $ad-bc = \frac{1}{2}(x^2+y^2-w^2-z^2)$.



(Aside: this set of coordinates lets you see, for instance, that the set of rank 1 matrices in the space of $2\times 2$ matrices realized as $\mathbb{R}^4$ is a cone over the Clifford torus, since $x^2+y^2 = w^2+z^2$ on a sphere $x^2+y^2+w^2+z^2=r^2$ implies $x^2+y^2 = r^2/2$ and $w^2+z^2 = r^2/2$, which are scaled equations for a flat torus.)



Let $r_1^2 = x^2+y^2$ and $r_2^2 = w^2+z^2$. (These are the radial coordinates of two cylindrical coordinate systems filling out 4-space.) Then the norm squared is

$$\tfrac{1}{2}\Bigl(r_1^2+r_2^2 + \sqrt{(r_1^2+r_2^2)^2 - (r_1^2-r_2^2)^2}\,\Bigr).$$


When this is less than one, this corresponds to the region plotted below:



[Figure: the spectral norm ball in the $(r_1,r_2)$ plane]



Note that each point in the $(r_1,r_2)$ picture corresponds to a different "torus", $x^2+y^2=r_1^2$, $w^2+z^2=r_2^2$.



We can now integrate over the shaded region, $\int_{\text{region}} dw\, dx\, dy\, dz$.



This 4-D integral can be reduced to 2D using $r_1$ and $r_2$, since $dx\, dy = 2\pi r_1\, dr_1$ and $dw\, dz = 2\pi r_2\, dr_2$:

$$4\pi^2 \int_{\text{region}} r_1 r_2\, dr_1\, dr_2.$$


Now, note that we can rewrite $r_2$ in terms of $r_1$. In particular, after some manipulation of our norm, the shaded region is defined by $r_2^2 \leq 2-2\sqrt{2}\,r_1+r_1^2=(\sqrt{2}-r_1)^2$. Hence $r_2 \leq \sqrt{2}-r_1$, and we can evaluate the $r_2$ integral:



$$4\pi^2 \int_{r_1=0}^{\sqrt{2}} r_1 \int_{r_2=0}^{\sqrt{2}-r_1} r_2\, dr_2\, dr_1
= 4\pi^2 \int_{r_1=0}^{\sqrt{2}} r_1\, \frac{(\sqrt{2}-r_1)^2}{2}\, dr_1
= 4\pi^2 \cdot \frac{1}{6}.$$


This yields $2\pi^2/3$, as Armin found.
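(As a sanity check, here is a small Monte Carlo sketch of my own; it uses the fact that every entry of a matrix is bounded by its spectral norm, so the unit ball sits inside the cube $[-1,1]^4$ of volume 16.)

    import numpy as np

    rng = np.random.default_rng(0)
    n = 400_000
    pts = rng.uniform(-1.0, 1.0, size=(n, 2, 2))          # random points in the cube
    sigma_max = np.linalg.svd(pts, compute_uv=False).max(axis=1)   # spectral norms
    vol = 16.0 * np.mean(sigma_max <= 1.0)
    print(vol, 2 * np.pi**2 / 3)   # both approximately 6.58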

ac.commutative algebra - Primitive element theorem without building field extensions

OK, I have a proof which meets your conditions. I relied on this write up of the standard proof as a reference.



Lemma 1: Let $K/F$ be an extension of fields, and let $f(x)$ and $g(x)$ be polynomials in $F[x]$. Let $d_F(x)$ be the GCD of $f$ and $g$ in $F[x]$ and let $d_K(x)$ be the GCD of $f$ and $g$ in $K[x]$. Then $d_F(x)$ and $d_K(x)$ coincide up to a scalar factor.



Proof: Since $d_F(x)|f(x)$ and $d_F(x)|g(x)$ in $K[x]$, we have $d_F(x)|d_K(x)$. Now, there are polynomials $p(x)$ and $q(x)$ in $F[x]$ such that $f(x) p(x) + g(x) q(x) = d_F(x)$. So $d_K(x) | d_F(x)$. Since $d_F(x)$ and $d_K(x)$ divide each other, they only differ by a scalar factor.



Lemma 2: Let $f(x)$ and $g(x)$ be polynomials with $g(0) neq 0$. Then, for all but finitely many $t$, the polynomials $f(tx)$ and $g(x)$ have no common factor.



Proof: Let $f(x) = f_m x^m + \cdots + f_1 x + f_0$ and $g(x) = g_n x^n + \cdots + g_1 x + g_0$. If $f(tx)$ and $g(x)$ HAVE a nontrivial common factor, then there are polynomials $p(x)$ and $q(x)$, of degrees $\leq n-1$ and $\leq m-1$, such that
$$f(tx)\, p(x)=g(x)\, q(x).$$
This is $m+n$ linear equations on the $m+n$ coefficients of $p$ and $q$.
Writing this out in coefficients, the matrix
$$\begin{pmatrix}
f_m t^m & \cdots & f_1 t & f_0 & 0 & 0 & \cdots & 0 \\
0 & f_m t^m & \cdots & f_1 t & f_0 & 0 & \cdots & 0 \\
& & \ddots \\
0 & 0 & \cdots & 0 & f_m t^m & \cdots & f_1 t & f_0 \\
g_n & \cdots & g_1 & g_0 & 0 & 0 & \cdots & 0 \\
0 & g_n & \cdots & g_1 & g_0 & 0 & \cdots & 0 \\
& & \ddots \\
0 & 0 & \cdots & 0 & g_n & \cdots & g_1 & g_0
\end{pmatrix}$$
has nontrivial kernel. The determinant of this matrix is a polynomial in $t$, with leading term $(f_m)^n (g_0)^m t^{mn} + \cdots$. (Recall that $g_0 \neq 0$.) So, for all but finitely many $t$, this matrix has nonzero determinant and $f(tx)$ and $g(x)$ are relatively prime. QED.



Now, we prove the primitive element theorem (for infinite fields). Let $\alpha$ and $\beta$, in $K$, be algebraic and separable over $F$, with minimal polynomials $f$ and $g$. We will show that, for all but finitely many $t$ in $F$, we have $F(\alpha - t \beta) = F(\alpha, \beta)$.



Let $f(x) = (x -\alpha) f'(x -\alpha)$ and $g(x) = (x - \beta) g'(x - \beta)$. Since $\beta$ is separable, we know that $g'(0) \neq 0$. Note that $f'$ and $g'$ have coefficients in $K$. By Lemma 2, for all but finitely many $t$ in $F$, the polynomials $f'(ty)$ and $g'(y)$ have no common factor. Choose a $t$ for which no common factor exists. Set $F' = F(\alpha - t \beta)$; our goal is to show that $F'=F(\alpha, \beta)$.



Set $h_t(x) = f(tx + \alpha-t \beta)$. Note that $h_t(x)$ has coefficients in $F'$. We consider the GCD of $h_t(x)$ and $g(x)$.



Working in $K$, we can write $h_t(x) = t (x - \beta) f'(t (x- \beta))$ and $g(x) = (x-\beta) g'(x-\beta)$. By the choice of $t$, the polynomials $f'(t (x- \beta))$ and $g'(x-\beta)$ have no common factor, so the GCD of $h_t(x)$ and $g(x)$, in the ring $K[x]$, is $x-\beta$.



By Lemma 1, this shows that $x - \beta$ is in the ring $F'[x]$. In particular, $\beta$ is in $F'$. Clearly, $\alpha$ is then also in $F'$, as $\alpha= (\alpha - t \beta) + t \beta$. We have never written down an element of any field larger than $K$. QED.
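(Here is a short sympy sketch of my own that runs the argument for a concrete pair, $\alpha=\sqrt 2$, $\beta=\sqrt 3$, $t=1$; it assumes sympy's `extension` option to compute the GCD over $F' = \mathbb{Q}(\alpha - t\beta)$.)

    import sympy as sp

    x = sp.symbols('x')
    alpha, beta, t = sp.sqrt(2), sp.sqrt(3), 1
    gamma = alpha - t * beta                   # candidate primitive element

    f = x**2 - 2                               # minimal polynomial of alpha over Q
    g = x**2 - 3                               # minimal polynomial of beta over Q
    h = sp.expand(f.subs(x, t * x + gamma))    # h_t(x) = f(tx + alpha - t*beta)

    # GCD over F' = Q(gamma); by the argument above it should be x - beta,
    # exhibiting beta (and hence alpha) as an element of F'.
    d = sp.gcd(h, g, extension=[gamma])
    print(d)                                   # expect x - sqrt(3), up to normalization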

nt.number theory - Neukirch's class field axiom and cohomology of units for unramified extension

This question may be too detailed, but perhaps somebody knows the answer: Neukirch proves in his algebraic number theory book, chapter IV, proposition 6.2, that his class field axiom implies that the Tate cohomology groups H^n(G(L|K), UL) for n = 0, -1 vanish for finite unramified extensions L|K, where UL is the group of units. He mentions in the proof that every element a in AL can be written as a = epsilon * piK^m, where epsilon is in UL and piK is a prime element in AK. Why does this work?
I absolutely understand this argument when the image of the valuation just lies in ZZ! But how does this work for a valuation whose image is widehat{ZZ}? Unless A is a profinite module, I don't know what piK^m is for a general m in widehat{ZZ}. Unfortunately this has to work in this generality for global class field theory.



(ZZ denotes the integers of course, sorry for my personal notation.)

Tuesday, 15 August 2006

ag.algebraic geometry - Relatively ample line bundles

EDIT : my previous answer was wrong. Thanks to BConrad for pointing it out.



Here is a counterexample if the map is not proper. Let $X$ be the plane, let $Y$ be the blow-up of the plane at one point, and let $U$ be $Y$ with one point of the exceptional divisor removed. Let $f|_U:U\to X$ be the projection and consider the line bundle $\mathcal{O}_U$.



Since the fibers of $f|_U$ are affine (either points or the affine line), $\mathcal{O}_U$ becomes ample when restricted to fibers of $f|_U$. However, $\mathcal{O}_U$ is not $f|_U$-ample.
Indeed, if it were, $\mathcal{O}_U$ would be ample on $U$, but we can compute:
$$H^0(U,\mathcal{O}_U^N)=H^0(U,\mathcal{O}_U)=H^0(Y,\mathcal{O}_Y)=H^0(X,\mathcal{O}_X),$$



where the second equality comes from property $S2$ and the third holds because $f_* \mathcal{O}_Y=\mathcal{O}_X$. Hence, $H^0(U,\mathcal{O}_U^N)$ cannot distinguish between two points of the exceptional divisor, and $\mathcal{O}_U$ cannot be ample on $U$.




Warning : what follows is false. I kept it here so that the comment below remains understandable.



Here is a counterexample if the map is not proper. Consider the inclusion $f$ of the plane minus a point, $U$, in the plane $X$. The line bundle $\mathcal{O}_U$ is ample restricted to the fibers of $f$ (they're points...). However, it is not $f$-ample. Indeed, if it were, $\mathcal{O}_U$ would be ample on $U$. Choosing $N\gg 0$, we would get $$H^1(U,\mathcal{O}_U)=H^1(U,\mathcal{O}_U^N)=0.$$
But a simple computation via Čech cohomology shows that this cohomology group is not trivial (in fact, infinite).




ra.rings and algebras - A condition that implies commutativity

Let $R$ be a ring. A notable theorem of N. Jacobson states that if the identity $x^{n}=x$ holds for every $x \in R$ and a fixed $n \geq 2$, then $R$ is a commutative ring.



The proof of the result for the cases $n=2, 3,4$ is the subject matter of several well-known exercises in Herstein's Topics in Algebra. The corresponding proofs rely heavily on "elementary" manipulations. For instance, the proof of the case $n=3$ can be done as follows:



1) If $a, b \in R$ are such that $ab= 0$ then $ba=0$.



2) $a^{2}$ and $-a^{2}$ belong to $\mathbf{Z}(R)$ for every $a \in R$.



3) Since $(a^{2}+a)^{3} = (a^{2}+a)^{2}+(a^{2}+a)^{2}$ it follows that



$a=a+a^{2}-a^{2} = (a+a^{2})^{3}-a^{2} = (a^{2}+a)^{2}+(a^{2}+a)^{2}-a^{2}$



whence the result. ▮



Certainly, the mind can't but boggle at the succinctness of the above solution. Actually, it is the conciseness of this argument that has prompted me to pose the present question: is an analogous demonstration of the general theorem possible? The one that appears in [1] depends on some non-trivial structure theorems for division rings.



As usual, I thank you in advance for your insightful replies, reading suggestions, web links, etc...



References



[1] I. N. Herstein, Noncommutative rings, The Carus Mathematical Monographs, no. 15, Mathematical Association of America, 1968.



[2] I. N. Herstein, Álgebra Moderna, Ed. Trillas, pp. 112, 119, and 153.

Monday, 14 August 2006

ag.algebraic geometry - For a given finite group G, is there a cover of P^1 over Q s.t. over C it's G-Galois?

I don't know about every finite group $G$ (I'll guess no), but there are definitely infinitely many finite groups $G$ for which the situation you describe obtains: the extension $K/\mathbb{C}(t)$ has a model over $\mathbb{Q}$ but is not Galois over $\mathbb{Q}$. (And for most of these groups, we do not know how to realize them as Galois groups over $\mathbb{Q}$, regularly or otherwise.)



For instance, this is the situation in a work in progress of John Voight and me:



http://math.uga.edu/~pete/triangle-091309.pdf



In our slightly different language, there are plenty of situations where the covering itself is defined over $\mathbb{Q}$ but the field of definition of the automorphism group $G$ is strictly larger. (This is equivalent to what you're asking, right? Please let me know.)



[Warning: recently, with the help of Noam Elkies, John and I realized that our arguments as given only work when (in our notation) $a = 2$. This is still a generalization of the setting in which I began this work some years ago: I had $a = 2$, $b = 3$, so I know for sure that there are infinitely many examples of this form.]

set theory - The isomorphism inference rule

First of all, I suspect that whenever a formal system has such an "isomorphism inference rule," all proofs using that rule can be converted to proofs not using it. (I don't know the details of Bourbaki's set theory, though.)



So what could an "isomorphism rule" look like? First of all, you need a metamathematical characterization of what constitutes an "isomorphism" between arbitrary "structures." I guess category theory can answer this question, but only if you encode your specific class of structures as a category. So that does not really define isomorphism on a metamathematical level. (Unless, apparently, one works within fully categorial foundations: http://cs.nyu.edu/pipermail/fom/2003-July/007064.html)



However, there is indeed a comparatively simple way to characterize isomorphisms. This is actually part of a formal system I have developed for a proof assistant, but you can apply the same principle in naive set theory if you don't mind a little vagueness. (Don't try to apply it to first-order axiomatic set theory, though.)



There is one rule of this system that matters here: Roughly speaking, for given x and S, you may not ask whether x is a member of S unless x was introduced as a member of some superset of S. The introduction of a variable as a member of a given set is considered primitive, so the rule is purely syntactical. An example:



We want to say: "Let S be the intersection of the set of primes with the set of even integers, and n be a member of S. Then n is in N." Surely, the set of primes is defined as some subset of N, which in turn is a subset of Z, and the set of even integers is also defined as a subset of Z. So n is syntactically known to be a member of Z, and we can ask whether it is also in N. However, we cannot ask whether it is e.g. in the set of finite graphs. Or whatever else. Formally, this amounts to a type system.



Another aspect of the formal system is that when you want to define what a "group" is, you need to specify when two "groups" are considered equal. So suppose you are given two sets S and T, as well as group operations on each. There is no syntactically "known" superset, so you cannot ask whether S=T because that would involve taking an element of S and asking whether it is in T. Now convince yourself that the most you can do is ask whether the two groups are isomorphic. That is, no other formula you can come up with will ever distinguish two isomorphic groups. (It is required to be reflexive, symmetrical, and transitive, of course.)



To conclude, it is possible to construct a system in which you cannot even talk about structures except up to isomorphism, for arbitrary structures. (That does not mean you cannot take their concrete sets into account in special cases. For example, if you have a group with two isomorphic subgroups, these will be considered equal as groups, but the sets can be different. Note how it makes sense to ask whether they are equal or different because we know a common superset.)



Now here is my equivalent of your "isomorphism inference rule": Since two isomorphic groups are in fact equal in this system, any property you can specify about one of them will be considered true of the other.

tqft - Separable and Fin. Gen. Projective but not Frobenius?

Let R be a commutative ring, and A an R-algebra (possibly non-commutative). Then A is separable if it is (fin. gen.) projective as an (A tensor_R A^op)-module. Suppose further that A is fin. gen. projective as an R-module. Does this imply that A is a (symmetric) Frobenius algebra?



There are lots of equivalent definitions of a Frobenius algebra. One (assuming A is a f.g. projective R-module) is that there exists an R-linear map tr: A --> R such that b(x,y) := tr(xy) is a non-degenerate bilinear form.



I know that the answer is yes when R is a field. What about other rings?



I am not an expert on algebras, but this question is related to understanding obstructions for extended TQFTs, and so I am very interested in knowing anything I can about it.

ac.commutative algebra - Elementary / Interesting proofs of the Nullstellensatz

I have been thinking about the question "What is the best -- i.e., some combination of shortest, most natural, easiest -- proof of the Nullstellensatz?" recently on the eve of a commutative algebra course.



In my notes up to this point I had been following Kaplansky's treatment of Goldman domains and Hilbert-Jacobson rings. This places the problem in a more general context and allows for an attractively thorough analysis. At the end one comes out with the following results:



(1) The polynomial ring $k[t_1,ldots,t_n]$ is a Jacobson ring -- i.e., every radical ideal is the intersection of the maximal ideals containing it.



(2) (Zariski's Lemma): If $mathfrak{m}$ is a maximal ideal of $k[t_1,ldots,t_n]$, then $k[t_1,ldots,t_n]/mathfrak{m}$ is a finite degree field extension of $k$.



(That Hilbert's Nullstellensatz follows from (1) and (2) is an easy, standard argument that I won't discuss here.)



But it is well-known that to prove the Nullstellensatz one needs only (2), because then (1) follows by a short and easy argument that everyone seems to like: Rabinowitsch's Trick. So perhaps this is a sign that developing the theory of (Hilbert-)Jacobson rings to prove the Nullstellensatz is overkill.



So the question seems to be: what is the best proof of Zariski's Lemma?



After looking around at various proofs, here is what I think the answer is now: it is an easy consequence of the following result.



Theorem (Artin-Tate Lemma): Let $R subset T subset S$ be a tower of rings such that
(i) $R$ is Noetherian,
(ii) $S$ is finitely generated as an $R$-algebra, and
(iii) $S$ is finitely generated as a $T$-module.
Then $T$ is finitely generated as an $R$-algebra.



Proof: Let $x_1,ldots,x_n$ be a set of generators for $S$
as an $R$-algebra, and let $omega_1,ldots,omega_m$ be a set of
generators for $S$ as a $T$-module. For all $1 leq i leq n$, we may
write
$$x_i = sum_j a_{ij} omega_j, qquad a_{ij} in T.$$
Similarly, for all $1 leq i,j leq m$, we may write
$$omega_i omega_j = sum_{k} b_{ijk} omega_k, qquad b_{ijk} in T.$$
Let $T_0$ be the $R$-subalgebra of $T$ generated by the $a_{ij}$ and $b_{ijk}$. Since $T_0$ is a finitely generated algebra over the Noetherian ring $R$, it
is itself a Noetherian ring by the Hilbert Basis Theorem.
Now each element of $S$ may be expressed as a polynomial in the $x_i$'s
with $R$-coefficients. Making substitutions using the two equations above shows that $S$ is a finitely generated $T_0$-module. Since
$T_0$ is Noetherian, the submodule $T$ is also finitely generated as a $T_0$-module. This immediately implies that $T$ is finitely
generated as a $T_0$-algebra and then in turn that $T$ is finitely generated
as an $R$-algebra, qed!



Proof that Artin-Tate implies Zariski's Lemma:



It suffices to prove the following: let $K/k$ be a field extension which is finitely generated as a $k$-algebra. Then $K/k$ is algebraic. Indeed, suppose otherwise: let $x_1,ldots,x_n$ be a transcendence basis for
$K/k$ (where $n geq 1$ since $K/k$ is transcendental), put $k(x) = k(x_1,ldots,x_n)$ and consider the tower of rings



$k subset k(x) subset K$.



To be sure, we recall the definition of a transcendence basis: the elements $x_i$ are algebraically independent over $k$ and $K/k(x)$ is algebraic. But since $K$ is a finitely generated $k$-algebra, it is certainly a finitely generated $k(x)$-algebra and thus $K/k(x)$ is a finite degree field extension. Thus the Artin-Tate Lemma applies to our tower: we conclude that $k(x)/k$ is a finitely generated $k$-algebra. But this is absurd. It implies the much weaker statement that $k(x) = k(x_1,ldots,x_{n-1})(x_n)$ is finitely
generated as a $k(x_1,ldots,x_{n-1})[x_n]$-algebra, or weaker yet, that
there exists some field $F$ such that $F(t)$ is finitely generated as
an $F[t]$-algebra: i.e., there exist finitely many rational functions
${r_i(t) = frac{p_i(t)}{q_i(t)} }_{i=1}^N$ such that every rational function
is a polynomial in the $r_i$'s with coefficients in $F[t]$. But $F[t]$ is a PID with infinitely many nonassociate nonzero prime elements (e.g. adapt Euclid's argument for the infinitude of primes), so we may choose a nonzero prime element $q$ which does not divide $q_i(t)$ for any $i$. It is then clear that $frac{1}{q}$ cannot be a polynomial in the $r_i(t)$'s: for instance, evaluation at a root of $q$ in $overline{F}$ leads to a contradiction.



Note that this is almost all exactly as in Artin-Tate's paper, except for the endgame above, which has been made a little more explicit and simplified: their conclusion seems to depend upon unique factorization in $k[t_1,ldots,t_n]$, which does not come until later on in my notes.



Further comments:



(i) The proof is essentially a reduction to Noether's normalization in the case of field extensions, which becomes the familiar result about existence of transcendence bases. Thus it is not so far away from the most traditional proof of the Nullstellensatz. But I think the Artin-Tate Lemma is easier than Noether Normalization.



(ii) Speaking of Noether: the proof of the Artin-Tate Lemma is embedded in the standard textbook proof that if $R$ is a finitely generated $k$-algebra and $G$ is a finite group acting on $R$ by
ring automorphisms, then $R^G$ is a finitely generated $k$-algebra. In fact I had already typed this proof up elsewhere in my notes. Realizing that the Artin-Tate Lemma is something I was implicitly proving in the course of another result anyway was part of what convinced me that this was an efficient route to the Nullstellensatz. Note that the paper of Artin and Tate doesn't make any connection with Noether's theorem and conversely the textbooks on invariant theory that prove Noether's theorem don't seem to mention Artin-Tate. (However, googling -- Artin-Tate Lemma, Noether -- finds several research papers which allude to the connection in a way which suggests it is common knowledge among the cognoscenti.)



Added: It turns out this is the proof of Zariski's Lemma given in Chapter 7 of Atiyah-Macdonald. I had missed this because (i) they give another (nice) proof using valuation rings in Chapter 5 and (ii) they do not attribute the Artin-Tate Lemma to Artin and Tate, although their treatment of it is even closer to the Artin-Tate paper than mine is above. (In the introduction of their book, they state cheerfully that they have not attributed any results. I think this is a drawback of their otherwise excellent text.)

Sunday, 13 August 2006

dg.differential geometry - How can one best link a circle to a straight line by an arc of continuously varying curvature?

Are there well-known plane curves with equations of the form y=f(x), which specify an arc s in the Cartesian plane E, having distinct end points P1 and P2, so that the following conditions may be satisfied?



Let R be a given positive real number. Let C be a circle in E having radius R, which does not intersect
the x-axis and has its center on the positive y-axis. f(x) is a strictly decreasing function of x and
the curvature of its graph s is a continuous and strictly decreasing function of arc length. P1 is the only
intersection point of s and C. s and C have the same tangent at P1 and the curvature of s at P1 is 1/R.
s is tangent to the x-axis at P2 and its curvature at P2 is zero.
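
If it helps to experiment, here is a rough numerical checker (my own sketch, not part of the question): given a candidate arc y = f(x) together with its first two derivatives, it samples the curvature k = f''/(1+f'^2)^(3/2) against arc length and tests whether it is strictly decreasing, as required above. The candidate f(x) = 1/x used below is only a placeholder to show the mechanics, not a proposed answer.

import numpy as np

def curvature_vs_arclength(f, df, d2f, x1, x2, n=2000):
    # sample the curvature of y = f(x) and the cumulative arc length on [x1, x2]
    x = np.linspace(x1, x2, n)
    kappa = d2f(x) / (1.0 + df(x) ** 2) ** 1.5
    s = np.concatenate([[0.0], np.cumsum(np.hypot(np.diff(x), np.diff(f(x))))])
    return s, kappa

# placeholder candidate y = 1/x on [1, 5]; its curvature happens to decrease there
s, kappa = curvature_vs_arclength(lambda x: 1 / x, lambda x: -1 / x**2, lambda x: 2 / x**3, 1.0, 5.0)
print(np.all(np.diff(np.abs(kappa)) < 0))   # True: |curvature| strictly decreases along the arc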

Saturday, 12 August 2006

nt.number theory - Can we color Z^+ with n colors such that a, 2a, ..., na all have different colors for all a?

I suggested in a comment to look for colorings that are the same for each a, except for a fixed permutation.



Suppose the colors are given by a function c(x) for x ≥ 1, and by a permutation function p(i,j), where i,j ∈ [1..n]. Here i is the index of the permutation, and j the index of the color being permuted.



If n = 4 and the coloring starts with abcd, then p(1,1) = a, p(1,2) = b, p(1,3) = c, p(1,4) = d, p(2,1) = b, p(2,2) = d.



If the coloring is the same for each a, then we get the following equation:



c(an) = p(c(a),c(n))



What matters now is to give a condition under which the above equation can be satisfied.



Lemma 1: If p(x,y) is commutative (p(x,y)=p(y,x)) and associative (p(x,p(y,z))=p(p(x,y),z)) and p(x,y)=p(1,xy) for xy ≤ n, then a coloring can be constructed.



Lemma 2: For each n there exists a permutation that is commutative and associative and satisfies p(x,y)=p(1,xy) for xy ≤ n.



To see how this works, for n = 4, we start with the following permutation matrix:
abcd
bd..
c...
d...



Fill in the second row:
abcd
bdac
ca..
dc..



And complete it:
abcd
bdac
cadb
dcba



From this construct the coloring:
abcdxaxcdxxbxxxa



For each prime number a > n (the x values in the coloring), you can choose a color for c(a) freely, but you then have to follow the permutation of that color for c(ma).



Lemma 1 is straightforward to prove: by commutativity and associativity, reaching a number through different values of a amounts to factoring the number and re-arranging the factors, so it must receive the same color either way. For the values xy ≤ n, you can't factor any further, and there the condition p(x,y)=p(1,xy) is exactly what is needed.
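
To make Lemma 1 concrete, here is a small script (mine, not part of the original argument) that builds the n = 4 coloring from the completed table above and checks that a, 2a, 3a, 4a always receive distinct colors. Primes larger than n are all given the color a here, which is one allowed choice; it is consistent because a is the identity row of the table.

n, N = 4, 2000
table = {                      # p(row, col), read off from the completed matrix abcd / bdac / cadb / dcba
    'a': {'a': 'a', 'b': 'b', 'c': 'c', 'd': 'd'},
    'b': {'a': 'b', 'b': 'd', 'c': 'a', 'd': 'c'},
    'c': {'a': 'c', 'b': 'a', 'c': 'd', 'd': 'b'},
    'd': {'a': 'd', 'b': 'c', 'c': 'b', 'd': 'a'},
}
base = {1: 'a', 2: 'b', 3: 'c', 4: 'd'}

def color(m):
    # fold the table over a factorization of m; Lemma 1 makes this independent of the factorization chosen
    if m in base:
        return base[m]
    for q in range(2, n + 1):                      # peel off a small factor if possible
        if m % q == 0:
            return table[base[q]][color(m // q)]
    return 'a'                                      # no factor <= n: a product of large primes, all colored a

ok = all(len({color(a * k) for k in range(1, n + 1)}) == n for a in range(1, N // n + 1))
print(ok)                                           # True: a, 2a, 3a, 4a always get distinct colors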



I haven't fully proven Lemma 2. A permutation matrix satisfying the commutativity and associativity conditions can be constructed by starting with a single permutation that has one cycle; the full matrix is obtained by applying this permutation repeatedly, up to n times.
One can prove that such a matrix is commutative and associative, but not necessarily that p(x,y)=p(1,xy) for xy ≤ n.



François's solution, written as a matrix, looks like this:
abcdef
bdface
cfbead
daebfc
ecafdb
fedcba



The permutation from the first row to the second row is not a permutation with a single cycle: it brings you from row 1 to row 2 to row 4 and back to row 1. However, it can still be treated as a single-cycle permutation, because the permutation from the first row to the third row does have a single cycle.



Instead of looking at the permutation per color, it is better to look at which colors are visited when the permutation is applied multiple times. In the above example the cycle (from the first row to the third row) is a->c->b->f->d->e->a.



If we just make the second row a permutation with one cycle, then the cycle starts with:
1->2->4->8->16->32->64 etc. Then we can add 3 and the other primes.



For n = 27 we can get:
1->2->4->8->16->3->6->12->24->x->9->18->5->10->20->27->x->15->7->14->11->22->x->21->25->13->26->1



In the above sequence, multiplication by 2 takes 1 step, multiplication by 3 takes 5 steps, and multiplication by 5 takes 12 steps. The remaining primes 17, 19, 23 can be placed on any of the x values. The short parts 11->22 and 13->26 can easily be exchanged. From this cycle, you can make a permutation and a permutation matrix satisfying the condition that p(x,y) = p(1,xy) for xy ≤ n. From that permutation the coloring can be constructed.



As you can see, it is not very difficult to construct a coloring this way. But the sequence is also rather crowded, and this is not yet a proof that it is always possible.

Friday, 11 August 2006

na.numerical analysis - Optimum small number for numerical differentiation

I use the technique given by J.C. Nash in the book "Compact Numerical Methods for Computers".



On page 219, there is the formula



$h=sqrt{epsilon}left(|x|+sqrt{epsilon}right)$



where $epsilon$ is of course machine epsilon. I have also seen the following formula used (apologies, I no longer recall where this was from):



$h=sqrt{epsilon}maxleft(|x|,sqrt{epsilon}right)$



The recipe for second derivatives is similar except that one now uses the cube root of machine epsilon instead of its square root.
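
For what it's worth, here is a minimal sketch (mine, not from Nash's book) of a forward-difference derivative that uses the first step-size formula above.

import math
import sys

def numdiff(f, x):
    eps = sys.float_info.epsilon
    h = math.sqrt(eps) * (abs(x) + math.sqrt(eps))   # the step size quoted above
    return (f(x + h) - f(x)) / h

x = 1.3
approx = numdiff(math.sin, x)
print(approx, math.cos(x), abs(approx - math.cos(x)))   # the error is roughly of order sqrt(eps)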

Thursday, 10 August 2006

measure theory - Separable sigma-algebra: equivalence of two definitions

The two notions are not equivalent. Indeed, they are not
equivalent even when one considers completing the measure
by adding all null sets with respect to any countably
generated $sigma$-algebra. Nevertheless, the forward
implication holds.



First, let me explain the forward implication. Suppose that
$S$ is a $sigma$-algebra generated by a countable
subfamily $S_0$ and $mu$ is a finite measure defined on
$S$. The semi-metric on $S$ is defined by
$d(A,B)=mu(Atriangle B)$. Let $S_1$ be the collection of
finite Boolean combinations of sets in $S_0$. This is a
countable family, and I claim it is dense in the
semi-metric. To see this, let $S_2$ be the closure of $S_1$
in the semi-metric, that is, the sets $Ain S$ that are
approximable by sets in $S_1$, in the sense that for any
$r gt 0$ there is $B in S_1$ such that $d(A,B) lt r$. Note
that $S_2$ contains $S_1$ and is closed under complement
since the measure was finite. I claim it is also closed
under countable unions: if each $A_n$ is approximable by
$B_n$ to within $r/2^n$, then $cup_n A_n$ is approximated
by $cup_n B_n$ to within $r$, and so one may find an
approximating finite union. So $S_2$ is actually a
$sigma$-algebra, and since it contains $S_0$, it follows
that $S_2=S$. That is, every set in $S$ is approximable by
sets in $S_1$, and so $S_1$ is a countable dense set in the
semi-metric, as desired.
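
As a concrete illustration of this density argument (my own toy example, not part of the answer), take Lebesgue measure on [0,1] with the dyadic intervals as the countable generating family $S_0$; a set such as $A=[0,1/3]$ is approximated arbitrarily well in the semi-metric $d(A,B)=mu(Atriangle B)$ by finite unions of generators.

from fractions import Fraction

A_right = Fraction(1, 3)                 # A = [0, 1/3]
for k in range(1, 11):
    step = Fraction(1, 2 ** k)
    m = A_right // step                  # number of full level-k dyadic intervals contained in A
    B_right = m * step                   # B = [0, m/2^k], a finite union of generators
    print(k, float(A_right - B_right))   # d(A,B) = mu(A triangle B) <= 2^(-k), which tends to 0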



Let's turn now to the reverse implication, which is not
generally true. The easiest counterexample for this in the
strict sense of the question is to let $X$ be a set of size
continuum and $S=P(X)$, the full power set of $X$. This is
a $sigma$-algebra, but it is easily seen not to be
countably generated on cardinality grounds. Fix any $p in X$ and let $mu$ be the measure placing mass $1$ at $p$ and
0 mass outside {p}. In this case, the family {emptyset, X}
is dense in the semi-metric, since every subset is
essentially empty or all of $X$, depending on whether it
contains $p$. So the semi-metric is separable, but the
$sigma$-algebra is not countably generated.



Note that in this counterexample, the $sigma$-algebra is obtained from the counting measure on {p} by adding a large cardinality set of measure $0$ and taking the completion. Similar counterexamples can be obtained by adding such a large cardinality set of measure $0$ to any space and taking the completion.



At first, I thought incorrectly that one could address the
issue by considering the completion of the measure, and
showing that the $sigma$-algebra would be contained within
the completion of a countably generated $sigma$-algebra.
But I now realize that this is incorrect, and I can provide
a counterexample even to this form of the equivalence.



To see this, consider the filter $F$ of all sets $A subset omega_1$ that contain a closed unbounded set of countable
ordinals. This is known as the club filter, and it is
closed under countable intersection. The corresponding
ideal $NS$ consists of the non-stationary sets, those
that omit a club, and these are closed under countable
union. It follows that the collection $S = F cup NS$, which
are the sets measured by a club set, forms a
$sigma$-algebra. The natural measure $mu$ on $S$ gives
every set in $F$ measure $1$ and every set in $NS$ measure
$0$. This is a countably additive 2-valued measure on $S$.
Note that every set in $S$ has measure $0$ or $1$; in
particular, there are no disjoint positive measure sets. It
follows that the family {emptyset,$omega_1$} is dense in
the semi-metric, since every set in $S$ either contains or
omits a club set, and hence either agrees with emptyset or
with the whole set on a club. Thus, the semi-metric is
separable. But for any countable subfamily $S_0subset S$,
we may intersect the clubs used to decide the members of
$S_0$, and find a single club set $C subset omega_1$ that
decides every member of $S_0$, in the sense that every
member of $S_0$ either contains or omits $C$. This feature
is preserved under complements, countable unions and
intersections, and therefore $C$ decides every member of
the $sigma$-algebra generated by $S_0$. The completion of
the measure on the $sigma$-algebra generated by $S_0$ is
therefore contained within the principal filter generated
by $C$ together with its dual ideal. This is not all of
$S$, since there are club sets properly contained within
$C$, such as the set of limit points of $C$. Thus, this is
a measure space that has a separable semi-metric, but the
$sigma$-algebra is not contained in the completion of any
countably generated $sigma$-algebra.

Wednesday, 9 August 2006

pr.probability - near independence of markov chain observations at high lags

From talking to statisticians, it seems like the standard thing to do is assume a yes. For a specified sequence or thinning, one can create Markov chains which exhibit 0 autocorrelation but a 'very large' amount of dependence despite the thinning, but the idea is that this should be pretty pathological, and so you are probably 'pretty safe'. The examples that come to mind for me are chains like '$X_{t}$ is iid 0-1 for t not divisible by a million, and equal to the sum of the previous 999,999 $X_{s}$ mod 2 for t divisible by a million'



So, you won't get any sort of theoretical justification without telling us something about your chain, because there really are bad chains out there, where what you want fails badly... but in the real world, you can probably wave your hands. Note that just increasing your thinning isn't enough to get rid of this - for any finite amount of thinning, there are bad chains which horribly violate independence despite having 0 covariance.
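
Here is a toy version of such a pathological chain (mine, with period 10 instead of a million): each value is an iid fair bit, except at multiples of 10, where it is the parity of the previous 9 bits. Every lagged correlation is near zero, yet those positions are completely determined by the preceding values, so no amount of thinning removes the dependence.

import random

random.seed(0)
T, period = 200000, 10
x = []
for t in range(1, T + 1):
    if t % period == 0:
        x.append(sum(x[-(period - 1):]) % 2)   # parity of the previous 9 bits
    else:
        x.append(random.randint(0, 1))         # fresh fair bit

def corr(lag):
    a, b = x[:-lag], x[lag:]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b)) / len(a)
    va = sum((u - ma) ** 2 for u in a) / len(a)
    vb = sum((v - mb) ** 2 for v in b) / len(b)
    return cov / (va * vb) ** 0.5

print([round(corr(k), 3) for k in range(1, 11)])   # all close to 0 despite the exact dependence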



There is a big world of convergence diagnostics out there. Gelman-Rubin, as pointed out, is a standard. Some other standards include the Geweke test, Mykland-Yu's Cusum test, Raftery-Lewis, and Heidelberg-Welch. There is a lot of literature on each of these, but with the slight exception of some easy calculations in Mykland-Yu, it is all extremely handwavey.

Tuesday, 8 August 2006

computational number theory - Computation of low weight Siegel modular forms

We have these huge tables of elliptic curves, which were generated by computing modular forms of weight $2$ and level $Gamma_0(N)$ as N increased.



For abelian surfaces over $mathbb{Q}$ we have very little as far as I know. The Langlands philosophy suggests that every abelian surface should be attached to a Siegel modular form of weight $(2,2)$ on $GSp_4$, but the problem is that this weight is not cohomological, which has the concrete consequence that it's going to be tough to compute such things using group cohomology. In particular one of the reasons that modular symbols work for computing elliptic curves, fails in this situation.



I guess though that one might be able to somehow use the trace formula to compute the trace of various Hecke operators on Siegel modular forms of weight $(2,2)$ and various levels, because presumably the trace formula translates the problem into some sort of "class group" (in some general sense) computation, plus some combinatorics.



[EDIT: from FC's comment, it seems that my guess is wrong.]



Has anyone ever implemented this and tabulated the results?



[NB I know that people have done computations for low level and high weight, for example there's a lovely paper of Skoruppa that outlines how to compute in level 1; my question is specifically about the weights that are tough to access]

latex - Automatically updating PDF reader for Windows

So, when doing LaTeX, it is absolutely necessary for one's sanity to use a preview program which updates automatically every time you compile. Of course, any previewer designed for DVIs will do this, but as far as I can tell, Adobe Acrobat not only does not automatically update, but will not let you change the PDF with it open.



On a Mac one can get around this by using Skim, and on *nix by using xpdf, but what should one do on Windows?

Monday, 7 August 2006

ac.commutative algebra - Primes in a (commutative) Jacobson ring

The result is true in general.



We may assume a counterexample is given in the form of a domain $R$ satisfying the second property but with nontrivial Jacobson radical, i.e. the closed points of Spec $R$ are not dense. Let $D$ be an affine open neighborhood of $(0)$ in Spec $R$ which contains no closed points. Since $D$ is affine, there exists some $x in D$ which is closed in $D$. That is,



$overline{lbrace x rbrace} setminus x subset text{Spec } R setminus D$.



Since $D$ is open, this implies



$x notin overline{overline{lbrace x rbrace} setminus x}$,



but this contradicts the requirements of the second property.

Saturday, 5 August 2006

exposition - Is it best to run or walk in the rain?

According to the Norwegian meteorological institute, the answer is that it is best to run. According to Mythbusters (quoted in the comments to that article), the answer is that it is best to walk.



My guess would be that this is something that can be properly modelled mathematically and solved. I suspect that I'm not alone in making a start on this, but as it's usually during a rainstorm that I think of it, I don't get far before deciding that the best strategy is to call a taxi!



I would also guess that someone has already done this and figured out the answer. If I'm right about that, I would like to read about it and learn what the real answer is. Thus my question is most likely: can someone provide a reference for this?



(If I'm wrong that it's already been solved, then some pointers as to how to solve it or even an explanation of what the main issues are, would be most interesting.)



As well as simply satisfying my curiosity, this is also one of those questions that non-mathematicians often ask (or at least, that non-mathematicians can easily understand the motivation for the question) so if there's a reasonable answer then it would be a good way to show off some mathematics.




Update: This question has a long and curious history. There's a discussion about it at the meta site: http://tea.mathoverflow.net/discussion/90/question-about-walking-in-the-rain

it.information theory - reversible Turing machines

Using the method of the universal Turing machine, consider the Turing machine $S$ that on input $x$ simulates the computation of $T$ on $x$, and keeps track of the entire computation history. That is, $S$ writes down on the tape complete descriptions (tape contents, state, head position) of each successive step of the computation of $T$ on $x$. If eventually $S$ finds that $T$ accepts or rejects $x$, then $S$ should also accept or reject $x$.



This computation is reversible, since at any stage of the computation of $S$ on $x$, we can easily read off $x$ from the description of the first configuration, and so we know how $S$ started and therefore how we got to where we are.



Another way to do it would be the following: using a pairing function, regard the tape as two tapes, and then on input $x$, make a copy of $x$ on one of the tapes, and then with each step of simulated computation increment a counter on this tape (by adding one more $1$). This process leads to a reversible computation, since from any stage in the computation we can tell what the input was, and how many simulated steps have been performed. So we know where the computation came from.
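
Here is a rough sketch of that second construction (mine, not from the answer above), with an arbitrary deterministic update standing in for one move of $T$: the carried input and step counter are exactly what let you reconstruct the previous stage.

def step(config):                  # toy stand-in for one move of T on a configuration
    return 3 * config + 1 if config % 2 else config // 2

def initial_config(x):             # toy stand-in for the starting configuration of T on input x
    return x

def forward(state):
    x, t, config = state
    return (x, t + 1, step(config))

def backward(state):
    x, t, _ = state                # knowing the input and the number of simulated steps,
    config = initial_config(x)     # we simply rerun the first t-1 moves from the start
    for _ in range(t - 1):
        config = step(config)
    return (x, t - 1, config)

s = (7, 0, initial_config(7))
for _ in range(5):
    s = forward(s)
print(s, backward(s))              # backward(forward(...)) recovers the previous stage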



These computations are reversible in the weaker sense that they can be reversed from the configurations that actually arise in the computation, whereas your definition seems to require reversibility on all possible finite configurations, even if these would not arise during an actual computation. Is that really what you want?

Wednesday, 2 August 2006

ac.commutative algebra - What is the most simple non-planar Gorenstein curve singularity?

I think Graham's answer already gave most of what you need to prove that $4$ is the smallest possible. Let $V$ be the integral closure of $R$, $n$ be the embedding dimension of $R$, and $e=e(R)$ be the multiplicity.



Claim: If $R=k[[x_1,cdots,x_n]]/I$ is Gorenstein and $n$ is at least $3$, then $dim_k(V/R)geq e$.



Proof: Let $m$ be the maximal ideal of $R$. As Graham pointed out, we have $e = dim_k(V/mV)$. So:



$$dim_k(V/R) =dim_k(V/mR)-dim_k(R/mR) geq dim_k(V/mV)-1=e-1$$



We need to rule out the equality. If equality happens, then one must have $mV=mR$. This shows that $m$ is the conductor of $R$. As you already knew, since $R$ is Gorenstein, one must then have $dim_k(V/R)=dim_k(R/m)=1$. The inequality now gives $e leq 2$. Abhyankar's inequality (part 2 of Graham's answer) gives $n leq 2$, so $R$ is planar, contradiction.



Now, one needs to show that for $R$ non-planar, $e geq 4$. You could use part $3$ of Graham's answer, or argue as follows: if $n geq 4$ we are done by Abhyankar's inequality. If $n=3$, a Gorenstein quotient of $k[[x,y,z]]$ must be a complete intersection, and so $I=(f,g)$, each of minimal degree at least $2$ since $R$ is not planar, thus $e$ must be at least $4$.



By the way, one could construct a domain $R$ such that $dim_k(V/R)=4$ as follows:
Take $R=k[[t^4,t^5,t^6]]$. The semigroup generated by $(4,5,6)$ is symmetric, so $R$ is Gorenstein. The Frobenius number is $7$, and $V/R$ is generated by $t,t^2,t^3,t^7$.



EDIT (references, per OP's request): Abhyankar inequality is standard, for example see Exercise 4.6.14 of Bruns-Herzog "Cohen-Macaulay rings", second edition (Link to the exact page). Or see exercise 11.10 of Huneke-Swanson book, also available for free here. Or Google "rings with minimal multiplicity".



(The original references are now available thanks to Graham, see his comment below)



As for $e=dim_k(V/mV)$, I could not find a convenient reference, but here is a sketch of proof using the above reference: First, use the additivity and reduction formula (Theorem 11.2.4 of Huneke-Swanson) to reduce to the domain case. Assume that $R$ is now a complete domain; then $V=k[[t]]$, and $R$ is a subring of $V$. Let $x in m$ be an element with smallest minimal degree. Then $mV=xV$ ($V$ is a DVR), and it is not hard to see that $e=$ the minimal degree of $x$ $=lambda(V/xV)$ (see Exercise 4.6.18 of Bruns-Herzog, same page as the link above).



Alternatively, one can use the fact that:
$$e(m,V) = text{rank}_R V cdot e(m,R) = e$$
The second equality holds because $V$ is birational to $R$, so $text{rank}_R V=1$. The left hand side can easily be computed from the definition to be the length of $V/xV$, which equals $dim_k(V/mV)$ (use $m^nV=x^nV$ since $V$ is a DVR).



Fun exercise!

st.statistics - Correlation and Causation. When can we believe correlation (reasonably, at least) imply causation

Many scientific disciplines are such that experiments are impossible. The effects of childhood abuse, for example. Scientists who study such matters are generally very sensitive to the fact that the empirical work they do cannot prove causation in the way that controlled studies can.



What a well-designed correlative study can prove or disprove is that of a long list of proposed explanations for some effect, one is the most likely. For example, you can make a study that determines which of the claims "A causes C" or "B causes C" is more likely. If you do enough of these, against every conceivable other thing that might affect C, then you can reasonably assert that you have evidence that A causes C, but you have to remember that it is a different kind of evidence from in-lab controlled experiments. (Sometimes it is better evidence: lab environments can be poor approximations of the actual world.) ((Yet another kind of evidence would be a proposed mechanism. If you can tell a convincing yarn about why A causes C, which builds on a series of well-established causal relationships, then you may claim you have evidence for your causal assertion. I think that this kind of evidence is generally the weakest, but it is often the one that people like the best, since most people understand the world through stories.))



But it is important to remember that just the existence of a correlation need not have much to do with a causal relationship. Here's one that comes to mind. It is a fact that a history of childhood victimization predicts (is correlated with) lower adulthood weight (the effect is weak, but statistically significant). However, the best guess for the relationship between childhood victimization and weight is that victimization causes the victim to be overweight in adulthood. This result shows up in the empirical data after you "hold other variables constant". In particular, childhood victimization strongly correlates with tobacco use, and tobacco is known to make people thinner. But if you correlate child-abuse victimization with weight within either the tobacco-using population or the tobacco-non-using population, you do see a correlation between abuse and obesity. Again, it is always possible that there are other better explanations for this effect, and to test them you go out in the field and measure all your proposed variables. My understanding of the result "child abuse causes obesity" is that it is more likely than any other explanation that people have thought of, in the sense that every variable scientists have thought of to hold constant or covary or all the other things they can do in the statistical models seems to lead to the asserted conclusion.

Tuesday, 1 August 2006

nt.number theory - Time complexity of finding the GCD of a set S as a function of sum(S)

Sorting a set of $n$ elements into ascending order takes $O(nlog n)$. You ask for a bound in terms of the sum, call it $T$, of the elements, rather than their number. I suspect that if the elements are all ones and twos and they are sufficiently jumbled up then there is no way to significantly improve on the time required to order them, and of course $T$ is within a constant multiple of $n$, so you still get $O(Tlog T)$.



Edit: Exercise 36 in 4.5.3 of Knuth, Seminumerical Algorithms, may be relevant. The question asks, what is the smallest value of $u_n$ such that the calculation of $gcd(u_1,dots,u_n)$ [by the method of this question] requires $N$ divisions? The answer given is $u_n=F_{N-n+3}$, where $F_m$ is (I'm pretty sure) the $m$-th Fibonacci number. There's a reference to a paper of G H Bradley, CACM 13 (1970) 433-436, 447-448.
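
To see the Fibonacci worst case concretely, here is a quick illustration (mine, and only of the two-argument case): counting the divisions Euclid's algorithm performs on consecutive Fibonacci numbers shows the count growing by one with each step up the sequence.

def divisions(a, b):
    count = 0
    while b:
        a, b = b, a % b
        count += 1
    return count

def fib(m):
    a, b = 0, 1
    for _ in range(m):
        a, b = b, a + b
    return a

for m in range(5, 12):
    print(fib(m + 1), fib(m), divisions(fib(m + 1), fib(m)))   # the division count is m - 1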

rt.representation theory - Geometric proof of Robinson-Schensted-Knuth correspondence?

This is more a comment on some of the answers given than an answer to the question itself (which isn't too clear in the first place: a correspondence is not something to prove, and why should one want a geometric proof of a statement (whichever) that is fundamentally combinatorial, unless it's just the right brain hemisphere that wants to join in the game?)



I have said this before, in print long ago, but since many people seem not to be aware I'll say it again: while a beautiful presentation, Viennot's description with shadows (ombres) is just a restatement of the "graph-theoretical viewpoint" that is given in Knuth's original paper (section 4), in a somewhat more geometric language. In fact I think the approach is really poset-theoretic: a permutation gives rise to a poset by embedding the $1$'s of permutation matrix into $mathbb N^2$ with the product partial ordering. The graph that Knuth considers is basically the restricted partial ordering, viewed as a directed graph (there is a minor complication for general integer matrices instead of permutation matrices, as they give rise to points with multiplicity). Viennot's partition into shadows is just the partition of the poset into anti-chains according to the "height" of its elements: the length of the longest chain ending in the element (which description is given in the Knuth paper). Once the shadows are defined, the construction proceeds by iterative formation of derived (and ever smaller) subsets of $mathbb N^2$/posets/digraphs in exactly the same way. Fulton's matrix-ball construction is again a reformulation of Viennot's construction, taking care to adapt to the general integer-matrix context of Knuth's paper; otherwise there is nothing new here either.
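
To make the height description concrete, here is a small sketch (mine): embed the 1's of a permutation matrix in N^2 with the product order and group the points by the length of the longest chain ending at each of them. The groups are the anti-chains just described, i.e. Viennot's first family of shadow lines, and their number is the length of the longest increasing subsequence (the first row of the RSK shape).

def height_partition(perm):
    points = list(enumerate(perm, start=1))          # the 1's of the permutation matrix, as (i, perm(i))
    heights = {}
    for (i, v) in points:
        below = [heights[(j, w)] for (j, w) in points if j < i and w < v]
        heights[(i, v)] = 1 + max(below, default=0)  # length of the longest chain ending here
    levels = {}
    for point, h in heights.items():
        levels.setdefault(h, []).append(point)
    return levels

print(height_partition([3, 1, 4, 2, 5]))
# {1: [(1, 3), (2, 1)], 2: [(3, 4), (4, 2)], 3: [(5, 5)]} -- each level is an anti-chain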



Personally I find the iterative nature of these constructions limits their transparency, even though it does not interfere with their proof of the symmetry of the RS / RSK correspondence. A similar "geometric" approach, but one which really adds insight with respect to Knuth's paper, is Fomin's growth-diagram approach (which can be given for the full RSK correspondence, even though it is often presented for permutations only). One has to consider all points of (the relevant part of) $mathbb N^2$ rather than just those occurring in the poset, and one has to attach partitions to each of them according to a somewhat subtle rule, but then the construction produces in one swoop (without iteration) the full pair of tableaux.