Ken Novak's Weblog

Purpose of this blog: to retain annotated bookmarks for my future reference, and to offer others my filter technology and other news.

Collection of raw notes on Bayes implementations

Future Now: Bayesian Nets: "A very good tutorial on Bayesian Nets, with lots of supporting information. Via the package Netica from Norsys. This methodology is becoming more common for delivering expertise. Because it's statistically based, it can model aspects of uncertainty in a system. The site has downloadable software for testing. This can be seen as a replacement for the 'rule bases' that we used for delivering expertise back in the 90s in expert systems. Here is another useful online tutorial."

The notes below were mostly collected from Slashdot | Bayesian Filtering Outside of Email?
Statistical background:

Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval - Lewis (ResearchIndex): "The naive Bayes classifier, currently experiencing a renaissance in machine learning, has long been a core technique in information retrieval. We review some of the variations of naive Bayes models used for text retrieval and classification, focusing on the distributional assumptions made about word occurrences in documents"

Mature scientific applications:

MrBayes: "MrBayes is a program for the Bayesian estimation of phylogeny. [The evolutionary relationships among organisms; the patterns of lineage branching]. Bayesian inference of phylogeny is based upon a quantity called the posterior probability distribution of trees, which is the probability of a tree conditioned on the observations. The conditioning is accomplished using Bayes's theorem. The posterior probability distribution of trees is impossible to calculate analytically; instead, MrBayes uses a simulation technique called Markov chain Monte Carlo (or MCMC) to approximate the posterior probabilities of trees."

BAMBE: "BAMBE Bayesian Analysis in Molecular Biology and Evolution"

The BUGS Project: "BUGS is a program that carries out Bayesian inference on complex statistical problems for which there is no exact analytic solution, and for which even standard approximation techniques have difficulties. Conditional independence assumptions mean that it is often convenient to represent the essential structure of the problem as a graphical model. A Markov chain Monte Carlo (MCMC) approach to numerical integration is used: "
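My own note on the MCMC idea that both MrBayes and BUGS rely on: the sketch below is a toy random-walk Metropolis sampler, not the algorithm either package actually implements. It assumes a made-up problem (a coin's bias after 14 heads and 6 tails, with a uniform prior) chosen because the exact posterior is known, so the simulated answer can be checked. The function names, step size, and sample counts are all my own illustrative choices.

```python
import math
import random


def log_posterior(theta, heads, tails):
    """Unnormalized log posterior of a coin's bias under a uniform prior."""
    if not 0.0 < theta < 1.0:
        return float("-inf")  # no prior mass outside (0, 1)
    return heads * math.log(theta) + tails * math.log(1.0 - theta)


def metropolis(heads, tails, n_samples=20000, step=0.1, burn_in=2000, seed=0):
    """Random-walk Metropolis sampler; returns the post-burn-in draws of theta."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, heads, tails)
    draws = []
    for i in range(n_samples + burn_in):
        # Propose a small Gaussian step away from the current value.
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, heads, tails)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, candidate - current))
        if rng.random() < accept_prob:
            theta, current = proposal, candidate
        if i >= burn_in:
            draws.append(theta)
    return draws


# Hypothetical data: 14 heads, 6 tails.
draws = metropolis(heads=14, tails=6)
mcmc_mean = sum(draws) / len(draws)
# With a uniform prior the exact posterior is Beta(15, 7), whose mean is 15/22.
exact_mean = (14 + 1) / (14 + 6 + 2)
```

Here the two means should land close together, which is the whole point: the sampler only ever evaluates the unnormalized posterior, so the same loop still works when the normalizing integral cannot be computed analytically, as with the tree posteriors in the MrBayes description.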