Friday, November 14, 2008

Bayesian Solutions to Everyday Problems

OK. This is a new theme I have been considering writing about. It should be fun, but what I need are everyday problems in which information is uncertain and a decision must be made. Here is a very simple and oft-discussed example. Suppose that you are on the show Let's Make a Deal. The scenario is as follows. You are shown three curtains and told that behind the curtains are three prizes. One of the prizes is valuable; the other two are generally not. You are asked to pick a curtain. At this point you have a 1/3 chance of having selected the curtain with the valuable prize behind it. You are then shown that behind one of the curtains you did not choose is one of the crummy prizes. You are then asked whether you would like to stick with the curtain you initially chose or switch to the other remaining curtain.

If you stick with your original choice you still have a 1/3 chance of winning. On the other hand, the prize had a 2/3 chance of being behind one of the other two curtains. This remains true even after you have been shown that a crummy prize was behind one of those two, provided the host knows where the prize is and always reveals a crummy one. As a result, the remaining curtain of the two still has a 2/3 chance of hiding the prize. Thus switching doubles your chances of winning. Neato!
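For the skeptical, the argument is easy to check by simulation. The following is a minimal sketch, assuming the host knows where the prize is and always opens a crummy curtain that you did not pick; the function names `play` and `win_rate` are made up for illustration.

```python
import random


def play(switch, rng):
    """Play one round; return True if the contestant ends up with the prize."""
    curtains = [0, 1, 2]
    prize = rng.choice(curtains)
    pick = rng.choice(curtains)
    # Assumption: the host opens a curtain that is neither the pick nor the prize.
    opened = rng.choice([c for c in curtains if c != pick and c != prize])
    if switch:
        # Move to the one curtain that is neither picked nor opened.
        pick = next(c for c in curtains if c != pick and c != opened)
    return pick == prize


def win_rate(switch, trials=100_000, seed=0):
    """Fraction of rounds won under the given strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print("stick :", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

With enough trials the two rates should settle near 1/3 and 2/3. If the host instead opened one of the other curtains at random (and merely happened to reveal a crummy prize), the conclusion would change, which is why the assumption matters.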

Wednesday, November 12, 2008

iPod Touch/iPhone Update

By popular request, here is another post on the configuration of my iPod Touch and how I use it to enhance my personal and professional lives. This post will be organized into two parts. The first will list the official App Store apps and web-based tools which I use. The second will just give a very quick overview of how to jailbreak the latest OS (and unlock for 2G phones) and list which unofficial apps I find useful. FYI, it's an increasingly small number these days.

OK. My App Store apps:

(1) Google Calendar (and/or Contacts and/or Mail) syncing via Microsoft Exchange. I used to use the jailbreak app NemusSync for this purpose, but the free NuevaSync service works even better. Directions for its use can be found here, but the essence of the procedure is that you create a free account with nuevasync.com and give it permission to read and update your Google calendar (by ticking an option). Then on your iPhone, go to Settings -> Mail, Contacts, Calendar. Then add an account, select Microsoft Exchange, and enter your newly made nuevasync.com info. Anyway, detailed instructions can be found at nuevasync.com.

(2) iTunes Remote. Use your iPhone as a remote control for iTunes on a machine on your local network. Just start it up and follow the instructions. Works perfectly and is great for parties.

(3) Fring. Turns your iPhone into an internet phone, allowing you to make calls (or send instant messages) using Skype or a SIP account. For SIP service I use Gizmo, but it's the same cost as SkypeOut (~2 cents a minute for international calls), so if you prefer Skype it is unnecessary to sign up for it.

(4) AirSharing. Basically, if you want to read papers while you fly, or carry around maps of the subway on your iPhone for those occasions when you have neither cell service nor wifi access, then this is the program for you. Once installed and properly configured, it makes your iPhone or Touch show up as a networked hard drive on your local area network. Using your laptop you can then drag and drop files to your iPhone/iPod Touch for later viewing. It supports PDFs, Word docs, Excel, and PPT, so you can even upload a presentation and give the smallest talk ever. Install the app through the App Store and then follow the instructions which appear when you start up the app for the first time.

Thursday, May 1, 2008

Zotero

I often extol the virtues of Zotero as the finest tool for tracking references and for storing and sorting electronic copies of your papers in a searchable, indexable, taggable way, all within Firefox. For sorting new papers, the Zotero button in Firefox just makes things too easy. But for your existing library of pdf and ps files it's a bit more of a hassle.

However, if you're in need of a monumental act of procrastination (as I was today) the following procedure seemed to work pretty well:

If the pdf already has a bibliographic entry in Zotero then just drag the file onto the Zotero entry to associate it. You can then delete the original file as Zotero has made a copy. If not, then...
1. Right Click in the Collections window and Create A Collection called Unsorted Files
2. Drag and Drop all your articles into this collection
3. Go make some coffee or something while Zotero copies the files to its indexing directory. Firefox may ask you if it should continue or cancel the script...just click to continue
4. Open up the new collection you created and double-click a PDF to view it in your browser
5. You can now enter the bibliographic information by hand
6. Alternatively, if you're looking at an academic article (I know I usually am) then it probably has a DOI. If you're very lucky that DOI is blue, indicating it is clickable. If it is, click it. This almost always takes you to a webpage that Zotero knows how to read. If it does, then just click on the Zotero button to the right of the URL at the top. If it doesn't, then google the DOI or the title and author and you will likely find a page that does have a Zotero button you can click
7. This creates a new entry that has the correct bibliographic information and places it into the Unsorted Files collection. Now just drag the pdf file you started with onto the new entry and check that the pdf now appears under the new entry in the tree structure.
8. Finally, drag the new Zotero entry to the appropriate collection and then right click to Remove the selected item from the Unsorted Files collection (do not delete from library)
Note this creates a new copy of the pdf file in the Zotero Directory so you can now delete the original file if you like. Anyway, this was eight steps, but once you get going it does go pretty quickly. If you have a better solution or know how to automate this please let me know.

Tuesday, April 29, 2008

Vista Crash Counter

My new laptop (a Dell XPS M1350) arrived Wednesday April 23 with Vista on it. It really didn't seem like that much of an improvement over XP (though I do like the launcher). Anyway, I decided to keep it that way just to try it out. So far:

Blue Screens of Death: 1
Desktop/Explorer Restarts: 2
Firefox Crashes: 1
iTunes Crashes: 1

Now, what I did to get Firefox and iTunes to crash would have crashed them on my XP box too, but a blue screen on day two was not terribly reassuring. That said, I had just installed a bunch of software.....

Monday, April 21, 2008

Mathematical Reasoning in Philosophy

For commentary later:

http://plato.stanford.edu/entries/mathematics-explanation/

Still not sure what is meant by explanation, but I guess this is the problem.

Discussion about the metaphysical significance of latent variables, i.e. that there is none.
Also, discuss intuitive priors and intuitive proofs, and why they are very useful but nonetheless should never be trusted. Discuss criticisms of the indispensability of mathematics. Is "reasoned fact" a meaningful statement outside mathematics?

Tuesday, April 15, 2008

Info here. $60 sounds like a lot...but then again screwing the man usually costs significantly more.

Update: Currently more trouble than it is worth.

Friday, April 4, 2008

iPhone/iPod Hacking

After a demonstration of the sheer utility of the iPod Touch, AP has taken on the task of trying to convince NIH that an iPhone is really just a pocket PC and is, therefore, as much a work-related expense as a laptop. I don't know if this will fly, but if you just list the features of a hacked iPhone, leaving out the phone part, it certainly does sound like an efficiency-enhancing device. I certainly feel that way about my new life partner, SuperFunky. He keeps me on time for appointments and up to date with new issues of my favorite journals, and helps me review papers on the airplane and check my email and the progress of jobs running on the cluster. Not only that, but like me he looks good in leather. Anyway, in response to a few requests, I am posting the procedure which I followed to get you and your iPhone into a state of sheer mobile computing bliss:

Step 1: Prep
Update iTunes to the latest version.
Plug in the iPhone and update it to the latest version (1.1.4).
Exit iTunes.

Step 2: Jailbreak and Unlock
Download ziphone.
Run the program and follow the instructions (this should only involve one mouse click).

iPhone Step 3: Unlock iPhone
Power off the iPhone by holding the top and bottom buttons until the slider appears...
Use a paperclip to pop out the SIM card.
Put in the SIM card of your choice.
Turn on the iPhone with the top button.
Use the iPhone to call your friends to boast of your accomplishments.

iPod Touch Step 3: Install iPhone Apps
Start Installer and click Install at the bottom.
Scroll down to iPhone 1.1.4 Applications and select it.
Select iPhone 1.1.4 Apps, then click Install in the upper right-hand corner.
****You may need to install the community sources and big boss's recommended.

Step 4: Installing Terminal app

Now the fun part.
Most of the applications you would like to install can be found through the Installer app icon which ziphone provided (it's the blue one). The problem is that not all of them work and not all of them work together very well. I have traced most of the problems I have had to ownership and file permission issues, which likely result from Apple's creation of a secondary user called "mobile" which does not have root access. Fortunately, straightforward use of the terminal app (or ssh) can remedy these issues. Ziphone installed ssh, but the terminal app is something you must install yourself. The terminal app in Installer seems to be broken, but a second installer app, Cydia, has one which works just fine. So first we will install Cydia. Here are the steps.

(0) First click on the "Settings" icon. Select "General", then set "Auto-lock" to "Never".
(1) Open the Installer app. Click on "Sources" in the lower right corner, then "Edit" in the upper right, then "Add" in the upper left. Enter the url: http://apptapp.saurik.com/
(2) After the sources are refreshed, click "Install". Under the "System" menu, you should find "Cydia Packager". Select Cydia and then click "Install" in the upper right.
(3) Once it's completed, exit the installer. A new icon labeled Cydia should have appeared. Click on it. Select "Install" and then "All Packages". Scroll down to "MobileTerminal", select it, and then press "Install" in the upper right.
(4) While you're here I would suggest also installing "lighttpd", "BossPrefs", "BossPrefs Lighttpd..", "BossPrefs Safari DL...", "1.1.3/4 Safari Patch", "Python" and, of course, "PuzzleManiak".

Step 4: Fixing broken stuff.

As mentioned, there are a few bugs associated with bad file permissions which are relatively easy to fix now that you have the terminal app up and running. So reboot your iPhone and then start the terminal app.
Then execute the following commands:

    mkdir Media/Downloads
    su        (enter the root password, alpine, when prompted)
    cd
    rmdir Media
    ln -s /var/mobile/Media /var/root/Media
    ln -s /var/mobile/Media/Downloads /var/root/Downloads
    ln -s /var/mobile/Media /var/root/Sites/Media
    chmod -R og+rx .
    cd /var/mobile
    chown -R mobile .
    chmod -R og+rx .
    exit

Step 5: Installing more stuff with Installer.

Start up Installer and hit Install at the bottom. Scroll down to "Sources" and install "community sources" and "Bigboss's Recommended...". Then install:
Under "Copy Coders": "1.l1.2 - Download Plugin"
Under "Development": "PHP"
Under "Productivity": "Lockbox" and "MobileToDoList2"
Under "Toys": "Flashlight"
Under "Tweaks": "PDF/DOC/XLS viewer", "Weather Icon Fix"
Under "Multimedia": "PocketTcouch" and "SendSong (ver 0.31)", "Pocket Touch"
Under "Utilities": "DropCopy (ver.0.461)" and "Sendfile (ver.0.37)"

Step 6: To navigate your media files using Safari, open up "bossprefs" and turn SafariDL to OFF and Lighttpd to ON. Then open up Safari and go to http://127.0.0.1. You should be able to view any files that Safari can view, including PDFs and docs.

Most of the other programs are relatively self-explanatory. The flashlight is particularly useful, as is DropCopy. Also, since ssh is installed, a program like WinSCP can be used to easily transfer files to and from the iPhone via its wireless connection. The IP address of the phone can be obtained from the bossprefs window. Also, when surfing the web you can download some files (mostly media files) by turning ON the bossprefs Safari download option. Clicking on media files will cause them to be downloaded to the Media/Downloads directory.

Step 7: Install Simplify Mobile Media, which is under Multimedia in the installer. This app will allow you to stream media from your PC to your iPhone wherever you are. When this first came out the poor interface made it useless, so I was using the web-based mycast.orb.com solution.
Now Simplify Media has a very nice interface. Install it on your phone, then download and install the server software on your PC. It's found here. Create an account (share it with me!) and we're off.

Also here is a workaround for copy and paste. It's not exactly elegant but it works in a pinch.

Monday, March 24, 2008

Latex to PNG

A converter for making PNG files from LaTeX equations for easy import into HTML docs can be found here. Here is an example of the output: [image missing]

You can also go here for a more powerful version of the same.

Correlations vs Coincidences

***THIS IS A WORK IN PROGRESS POSTED FOR REVIEW ONLY****

I was reviewing a paper today which makes a very common error, one that I will be discussing in an upcoming talk I have been putting together, so I thought I would take this opportunity to write a little on the issue of correlations and coincidences and their consequences for neural computation and decoding.

A typical discussion of correlations rightly notes that correlations can affect the information content of a neural code. This is trivially demonstrated by considering the information content of a Gaussian distributed population r.

[equation missing]

Here $\mu(s)$ is the stimulus-conditioned mean and $\Sigma(s)$ is the stimulus-conditioned covariance. This expression has two terms. The first we label the "linear" Fisher information and the second we label the "quadratic" contribution to Fisher information. This is because the first term represents the inverse variance of the unbiased, locally optimal, linear estimator of the stimulus. Here linear means the estimator is parameterized by $s_{est} = w \cdot r + b$, optimal means minimum variance, and unbiased means $w \cdot \mu'(s) = 1$. In a similar fashion, the total Fisher information (linear + quadratic) gives the inverse of the minimum variance associated with an unbiased estimator which operates on quadratic functions of r.
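The expression itself did not survive in this copy of the post. Assuming it is the standard Fisher information of a multivariate Gaussian whose mean and covariance both depend on the stimulus (an assumption, though it matches the two-term description above), it reads:

```latex
I(s) = \underbrace{\mu'(s)^{T}\,\Sigma^{-1}(s)\,\mu'(s)}_{\text{linear}}
     + \underbrace{\tfrac{1}{2}\,\mathrm{Tr}\!\left[\Sigma'(s)\,\Sigma^{-1}(s)\,\Sigma'(s)\,\Sigma^{-1}(s)\right]}_{\text{quadratic}}
```

Primes denote derivatives with respect to $s$. The first term depends only on how the mean changes with the stimulus; the second depends only on how the covariance changes.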
Clearly, the form of this expression indicates that, at least for Gaussian distributed populations, correlations affect both the information content of the population and the form of the optimal decoder. Indeed, a bit more work can be used to show that the weights of the optimal estimators take the form:

[equation missing]

Here repeated indices imply summation, and $w^{lin}_i$ operates on $r_i$ while $w^{quad}_{ij}$ operates on $r_i r_j$.

Support Vector aficionados will also note that this computation hints at their favorite trick. Specifically, if we define z to be a vector which contains each element of the vector r and every cross product $r_i r_j$, then all the information about s contained in z is linear information, i.e. $I_r(s) = I_z^{lin}(s)$, and so linear discriminants in this augmented z space can optimally estimate/discriminate stimuli. This transformation also indicates how to estimate Fisher information in a non-parametric way.

This is important when estimating information from data. Indeed, this issue came up the other day when JD was comparing different methods for estimating Fisher information. Among the methods considered (see below someday), one was the direct method. This involves taking data from two nearby values of the stimulus, $s$ and $s + \delta s$, then simply computing the empirical mean and covariance and their derivatives (differences) and plugging them into the expression above. The problem is that this only works when you know for a fact that your data is Gaussian distributed. This allows you to put some extra information into the computation. Specifically, you are allowed to assume that the third central moments are actually zero and that the fourth central moments can be computed from the second moments (regardless of their empirical values). Of course, a priori you have no good reason to assume Gaussianity, and you must utilize your estimates of these higher moments, which will be quite crappy when you are data limited.
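The direct method can be sketched in a few lines. This is a minimal illustration, not JD's actual code: it assumes responses are stored as (trials, neurons) arrays, uses the standard Gaussian Fisher information formula (linear term plus half the trace term), and the name `gaussian_fisher_direct` is made up for this example.

```python
import numpy as np


def gaussian_fisher_direct(r_a, r_b, ds):
    """Plug-in Fisher information estimate under a Gaussian assumption.

    r_a, r_b: arrays of shape (trials, neurons) recorded at s and s + ds.
    Returns (linear term, quadratic term).
    """
    # Finite-difference derivative of the mean.
    mu_prime = (r_b.mean(axis=0) - r_a.mean(axis=0)) / ds
    cov_a = np.atleast_2d(np.cov(r_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(r_b, rowvar=False))
    cov = 0.5 * (cov_a + cov_b)        # covariance at the midpoint
    cov_prime = (cov_b - cov_a) / ds   # finite-difference derivative
    cov_inv = np.linalg.inv(cov)
    i_lin = mu_prime @ cov_inv @ mu_prime
    m = cov_prime @ cov_inv
    i_quad = 0.5 * np.trace(m @ m)
    return i_lin, i_quad
```

Note that the quadratic term is a sum of squared noisy quantities, so it comes out positive even when the true covariance does not depend on the stimulus at all. With few trials that upward bias can be substantial, which is one concrete face of the data-limitation problem.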
Anyway, the non-parametric estimate of the Fisher information contained in the first N moments can be obtained by defining the vector function T(r) as a vector which spans the set of polynomials of order N, i.e. for N=2, T(r) contains each $r_i$ and every possible $r_i r_j$ cross product. Fisher information is then obtained from the generative model for which

[equation missing]

where

[equation missing]

and $\langle \cdot \rangle_s$ indicates the stimulus-conditioned average. In this case information about the parameters theta takes the form

[equation missing]

and information about the stimulus is given by

[equation missing]

This equation clearly indicates that a reasonable direct estimation of the information content in the first N moments requires good estimation of the stimulus dependence of the first 2N moments, followed by use of the above equation, or, in the case of Gaussian distributed data, use of the first equation in this post.

An important question arises when you are not sure whether or not your data is Gaussian, or whether or not higher moments are informative. Indeed it is possible to show that, even when all the information is linear Fisher information, it is still possible that all the moments of the data depend upon the stimulus. Indeed, direct calculation can be used to show that

[equation missing]

where [symbol missing] is the Nth central moment. Now most methods for computing information effectively place a prior on the stimulus dependence of higher moments that takes this form. If that prior is correct then direct estimation will work well. For example, if the data is actually Gaussian then using the first equation in this post will give an accurate estimate of the information when you have enough data to get a reasonable estimate of the first and second order statistics. However, when the data is not Gaussian, using this equation will not necessarily lead to a correct estimate, especially when the empirically observed third and fourth order statistics are inconsistent with the Gaussian assumption.
This particular issue is only relevant to the estimation of quadratic information. There is also a case for which direct estimation of linear Fisher information is problematic. This is when correlations are small (or have small variability) but nonetheless have a strong influence on the information content. For example, this is the case when I_diag is significantly less than I and the correlations are on the order of the inverse of the number of units. In this case estimation of correlations is unreliable, and so also is direct estimation of information based upon this estimation.

Indirect estimation of information, on the other hand, is based upon computing theta'(s) subject to some reasonable prior. Typically priors which favor small values of theta'(s) are used. As is clear above, theta' can be estimated by inverting a covariance matrix. But this inversion can be complicated, and estimation of the covariance matrix can be very noisy. Early stopping, regularization, and Bayesian logistic regression avoid these issues.

Regardless, a popular but misguided method for analyzing the information consequences of correlations is to compute the information content of a population which has the same first order statistics and autocorrelations, but no cross-correlations between units. Practically speaking, this is accomplished by shuffling your data across trials. More generally, one can compute the information content contained in the product of the marginal distributions of each unit in the vector r. The difference between the information content of the true distribution and the information content of the shuffled distribution is labeled delta I_shuffled and is often used as a proxy for measuring the information consequences of correlations. Problems with this metric are discussed here. These criticisms can be summarized by noting that I_shuffled represents the information content associated with a code that simply does not exist in cortex.
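For concreteness, trial shuffling can be sketched as follows. This is a minimal illustration assuming responses to a single stimulus value are stored as a (trials, units) array; the name `shuffle_across_trials` is made up for this example.

```python
import numpy as np


def shuffle_across_trials(r, rng=None):
    """Independently permute each unit's responses across trials.

    r: array of shape (trials, units), all recorded at the same stimulus.
    Each unit keeps its own marginal distribution, but the trial-by-trial
    pairing between units (and hence their cross-correlation) is destroyed.
    """
    rng = np.random.default_rng() if rng is None else rng
    shuffled = np.empty_like(r)
    for j in range(r.shape[1]):
        shuffled[:, j] = rng.permutation(r[:, j])
    return shuffled
```

The shuffle must be done separately for each stimulus value, otherwise it would also scramble the stimulus dependence of the responses.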
Wu and Amari understood this when they wrote their humorously (unintentionally?) titled Unfaithful Model... paper. In that work, they correctly pointed out that the presence or absence of correlations is simply a fact given by the data. What matters is whether or not the decoder of that activity which is implemented by cortex is capable of properly taking those correlations into account. To address this issue they constructed the so-called I_diag metric, which computes the information (Fisher in this case) associated with a particular suboptimal decoder, specifically, the one which assumes statistical independence between neurons. For linear Fisher information, I_diag takes the simple form:

[equation missing]

Here $\Sigma_{diag}$ is the covariance matrix with all the off-diagonal elements removed. For contrast, I_shuffled takes the form:
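The two expressions did not survive in this copy of the post. Assuming they are the forms commonly quoted for linear Fisher information (an assumption, though consistent with the description of $\Sigma_{diag}$ above), they would read:

```latex
I_{diag} = \frac{\left(\mu'^{T}\,\Sigma_{diag}^{-1}\,\mu'\right)^{2}}
                {\mu'^{T}\,\Sigma_{diag}^{-1}\,\Sigma\,\Sigma_{diag}^{-1}\,\mu'},
\qquad
I_{shuffled} = \mu'^{T}\,\Sigma_{diag}^{-1}\,\mu'
```

Under this reading, $I_{diag}$ is the inverse variance achieved on the true, correlated data by the linear decoder $w \propto \Sigma_{diag}^{-1}\mu'$, while $I_{shuffled}$ is the linear Fisher information of a hypothetical population with covariance $\Sigma_{diag}$.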

Delta I_shuffled = I - I_shuffled can be either positive or negative. On the other hand, delta I_diag = I - I_diag is a non-negative quantity and truly represents information loss due to suboptimal decoding, and an upper bound on information loss due to suboptimal computation. Moreover, as pointed out in Averbeck and friends, delta I_shuffled can be either positive or negative even when delta I_diag is zero, and delta I_diag can be quite large even when delta I_shuffled is negative or zero. So not only is I_diag the behaviorally relevant measure of the consequences of correlations, it's the only one which has a consistent interpretation (i.e. a bound on information loss). For those who prefer Shannon information, similar metrics were discussed by Latham and Nirenberg. In that work, the inverse variance of a stimulus decoder applied to neural activity (Fisher information) was replaced with the KL divergence between the true and the parameterized posterior distribution of the stimulus given neural activity. In particular, they defined delta I_cor-dep to be the KL divergence between the true posterior and a posterior which is obtained from a generative model which models the joint distribution of responses given the stimulus as the product of the marginal distributions, i.e. $p_{ind}(r|s) = \prod_i p(r_i|s)$.

In contrast, the Shannon equivalent of delta I_shuffled is the synergy/redundancy metric which takes the form:
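The expression did not survive in this copy of the post. The usual definition of the synergy/redundancy metric, which is presumably what was shown (an assumption), compares the information carried by the joint response with the sum of the informations carried by each unit alone:

```latex
\Delta I_{syn} = I(s;\, r_1, \ldots, r_n) \;-\; \sum_{i=1}^{n} I(s;\, r_i)
```

Positive values are conventionally called synergy and negative values redundancy.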

This metric can also be positive or negative. Moreover, as with the Fisher information equivalent, this metric cannot be related to KL divergence or information loss and, as such, is also not behaviorally relevant.
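The sign behavior of the two linear Fisher metrics is easy to see in a two-neuron toy example. The sketch below assumes the commonly used linear-Fisher forms written in the docstrings (they are not taken from the post itself, whose equations are not shown), and the function names are made up for illustration.

```python
import numpy as np


def linear_fisher(f_prime, cov):
    """I = f'^T cov^{-1} f' (optimal linear decoder)."""
    return f_prime @ np.linalg.solve(cov, f_prime)


def fisher_shuffled(f_prime, cov):
    """I_shuffled = f'^T cov_diag^{-1} f' (correlations removed from the data)."""
    return np.sum(f_prime**2 / np.diag(cov))


def fisher_diag(f_prime, cov):
    """I_diag = (w.f')^2 / (w^T cov w) with w = cov_diag^{-1} f'.

    The decoder ignores correlations but is applied to the correlated data.
    """
    w = f_prime / np.diag(cov)
    return (w @ f_prime) ** 2 / (w @ cov @ w)


if __name__ == "__main__":
    c = 0.5  # noise correlation between the two neurons
    cov = np.array([[1.0, c], [c, 1.0]])
    for f_prime in (np.array([1.0, 1.0]), np.array([1.0, -1.0])):
        i = linear_fisher(f_prime, cov)
        print(f_prime,
              "dI_shuffled =", i - fisher_shuffled(f_prime, cov),
              "dI_diag =", i - fisher_diag(f_prime, cov))
```

With equal tuning slopes and positive correlation, delta I_shuffled comes out negative; with opposite slopes it comes out positive. In both cases delta I_diag is zero (up to rounding), because the correlation-blind decoder happens to coincide with the optimal one. This illustrates the claim that delta I_shuffled can take either sign even when nothing is lost by ignoring correlations.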

Sunday, March 23, 2008

J. Bayesian Analysis

A nice free journal via JD

Behavioral Evidence for Bayesian Computation

Knill, D., and Kersten, D. (1991) Nature
Weiss, Simoncelli, Adelson (2002). Nat Neurosci
Kersten, D., Mamassian, P., Yuille, A. (2004) Ann Rev Psych

Knill, (1998). Vision Research
Jacobs, (1999). Vision Research
Ernst, Banks (2002). Nature

Wolpert, Ghahramani, Jordan (1995). Nature
Todorov, Jordan (2002). Nature Neuroscience
Kording, Wolpert (2004). Nature

Bayes, Laplace, Bayesian and neo-Bayesians

More evidence that Laplace was the father (or perhaps the dedicated single mother) of Bayesian inference can be found here.

Contains this little gem: ...the term Bayesian "was first used in print by R.A. Fisher in the 1950 introduction to his 1930 paper on fiducial inference entitled Inverse Probability [wherein he seeks to] 'distinguish [his result] from the Bayesian probability a posteriori.'"

I also enjoyed the many references to Stigler's Law which states that: "no scientific discovery is named after its original discoverer." The author also notes that "...it is worth noting that Stigler proposed [this Law] in the spirit of a self-proving theorem," but then neglects to mention from whom it was stolen...

Also contains an interesting aside on objective vs subjective probability. I particularly like the comment indicating that many philosophers disliked the notion of subjective probability, but ultimately the sheer utility of the concept won the day. Anyway, the definition of objective probability likens it to a frequentist situation where pr(H) = 1/2 every time I toss a fair coin. This theory can be tested by tossing the coin many times and showing that the fraction of heads converges to 1/2. So it's objective in the sense that the number 1/2 refers to a quantity which can actually be observed.

On the other hand, the typical exemplar of a subjective probability is a statement of belief about what the weather will be like tomorrow. The distinction is that, since tomorrow only comes once, there is no way comparable to the method used for the coin toss to verify that, on this particular day, there is a 50% chance of rain.

However, it seems to me that either this example or this distinction is rather poorly thought out. The repeatability and inter-toss independence of a coin is itself an assumption which should be subject to verification. This is no different than questioning the reasonability of comparing the clouds (or temp/pressure/etc) of today with those of yesterday. True, we've got a lot more evidence about coins, but let's face it, even after a billion coin tosses, we can conclude for certain neither that we have a fair one, nor even that we have been tossing the "same" coin all this time.
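To put a number on the billion-toss remark, here is a minimal sketch. It assumes a uniform Beta(1, 1) prior on the heads probability and independent, identically distributed tosses; both are assumptions made here for illustration, and the second is exactly the one being questioned above. The function name `beta_posterior` is made up.

```python
import math


def beta_posterior(heads, tosses, a=1.0, b=1.0):
    """Mean and standard deviation of the Beta(a + heads, b + tails) posterior."""
    a_post = a + heads
    b_post = b + (tosses - heads)
    n = a_post + b_post
    mean = a_post / n
    std = math.sqrt(a_post * b_post / (n * n * (n + 1)))
    return mean, std


if __name__ == "__main__":
    for tosses in (10, 1_000, 1_000_000_000):
        mean, std = beta_posterior(tosses // 2, tosses)
        print(f"{tosses:>13,} tosses: mean={mean:.3f}, std={std:.2e}")
```

The posterior width shrinks roughly like one over the square root of the number of tosses, but it never reaches zero, so even within this idealized model the belief that the coin is exactly fair never gets probability one.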

This led me to think that this issue pertains to the debate, brought to my attention by JJ, concerning whether or not a rational doxastic state (fancy word for belief) could have probability one. To my mind the resolution of that issue was that only conditional probabilities could take on probability 1. Such conditionals are syllogisms. In the context of this discussion, I would suggest that conditional probabilities which represent statements of model assumptions are objective, while probabilistic statements concerning empirical quantities are subjective. This is because additional evidence, or even the consideration of additional models, can alter degrees of belief regarding empirical quantities, while it is true by assumption that if we have a fair coin it will come up 50% heads if we toss it an infinite number of times. This is true regardless of how many times we actually toss it.

Inaugural Post

It seemed appropriate to describe my likely to be unrealized intentions for this blog. Firstly, it became apparent recently that my life required more order and focus than it has had in recent years. So, I began an experiment with gCalendar and a PDA (actually it's a hacked Ipod touch which is not a toy :). This was sufficiently successful that I am now seeking ways to better organize my thoughts. My current strategy has been to utilize a hierarchical file structure which groups projects according to the tree structure:

topic

• collaborators

• background papers
• code

• data

• data analysis and figures
• papers

• drafts
• supplements
• mathematical proofs

This works well enough for the projects which have gotten sufficiently off the ground, but has been a miserable failure for storing, sorting and otherwise tracking less well developed ideas or little bits of potentially useful and interesting information. Of course, privately blogging might be sufficient to deal with this issue, so why make this public? That pertains to my second issue: I truly despise the writing phase of scientific research. It is my hope that the practice and feedback that a blog can provide will sharpen both my ideas and my communication and presentation skills.

Finally, I should say that I quite enjoy a good debate/discussion concerning the fine details of mathematical proofs and empirical inference and hope that this blog will provide a forum for such discourse. I intend to share links with friends and collaborators and hope to write very technical entries regarding research (including useful analytic and statistical methods and reviews of recent publications). More general-interest entries regarding philosophy of science should be expected as well.

That said, we live in an unpredictable world, and people are often the worst predictors of their own behavior...so, we shall see what we shall see.