In this post we’re going to model the prices of Airbnb apartments in London. In other words, the aim is to build our own price suggestion model. We will be using data from http://insideairbnb.com/ which we collected in April 2018. This work is inspired by the Airbnb price prediction model built by Dino Rodriguez, Chase Davis, and Ayomide Opeyemi. Normally we would be doing this in R, but we thought we’d try our hand at Python for a change.
Because it's Friday: Hey, it's Enrico Pallazzo!
It seemed like such a simple movie. The Naked Gun (1988) is slapstick comedy through-and-through, but I never would have guessed (h/t Steven O’Grady) how much detail and planning went into the jokes, especially the baseball scene at the end. There’s lots of interesting behind-the-scenes info in Sporting News’s breakdown of the movie. Even Drebin’s bungled National Anthem performance was composed in advance for the scene:
Hitchhiker's guide to Exploratory Data Analysis
How to investigate a dataset with python?
Is it time to stop using sentinel values for null / "NA" values?
Fri 12 October 2018
Sketchnotes from TWiML&AI: Evaluating Model Explainability Methods with Sara Hooker
These are my sketchnotes for Sam Charrington’s podcast This Week in Machine Learning and AI about Evaluating Model Explainability Methods with Sara Hooker:
The Economist's Big Mac Index is calculated with R
The Economist’s Big Mac Index (also described on Wikipedia if you’re not a subscriber) was created (somewhat tongue-in-cheek) as a measure to compare the purchasing power of money in different countries. Since Big Macs are available just about everywhere in the world, the price of a Big Mac in Sweden — expressed in US dollars — gives an American traveler a sense of how much more expensive things will be in Stockholm. And comparing the price of a Big Mac in several countries converted to a single baseline currency is a measure of how over-valued (or undervalued) those other currencies are compared to that baseline.
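The comparison described above can be sketched in a few lines of Python. This is only a minimal illustration of the usual "implied exchange rate" arithmetic, not The Economist's actual R code; the function name `big_mac_valuation` and all of the prices and rates below are hypothetical values chosen for the example.

```python
def big_mac_valuation(local_price, us_price, exchange_rate):
    """Compare a local Big Mac price against the US price.

    local_price   -- price of a Big Mac in the local currency
    us_price      -- price of a Big Mac in US dollars
    exchange_rate -- market rate, in local currency units per US dollar

    Returns (dollar_price, implied_rate, valuation), where valuation > 0
    suggests the local currency is over-valued against the dollar and
    valuation < 0 suggests it is under-valued.
    """
    # Local price expressed in US dollars at the market exchange rate.
    dollar_price = local_price / exchange_rate
    # Exchange rate at which the burger would cost the same in both countries.
    implied_rate = local_price / us_price
    # Relative gap between the implied rate and the market rate.
    valuation = implied_rate / exchange_rate - 1.0
    return dollar_price, implied_rate, valuation


# Hypothetical inputs: a burger costing 50 local units, 5 dollars in the US,
# and a market rate of 8 local units per dollar.
dollar_price, implied_rate, valuation = big_mac_valuation(
    local_price=50.0, us_price=5.0, exchange_rate=8.0
)
print(f"Price in USD: {dollar_price:.2f}")
print(f"Implied exchange rate: {implied_rate:.2f}")
print(f"Valuation against the dollar: {valuation:+.1%}")
```

With these made-up inputs the burger costs 6.25 dollars abroad against 5.00 at home, so the local currency comes out 25% over-valued relative to the dollar baseline.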
Distilled News
Causality and graphical methods
Stan on the web! (thanks to RStudio)
So you can get started on Stan without any investment in set-up time, no need to install C++ on your computer, etc.
R Packages worth a look
Revealed Preferences and Microeconomic Rationality (revealedPrefs) Computation of (direct and indirect) revealed preferences, fast non-parametric tests of rationality axioms (WARP, SARP, GARP), simulation of axiom-cons …
If you did not already know
Open Domain INformer (ODIN)
Rule-based information extraction (IE) has long enjoyed wide adoption throughout industry, though it has remained largely ignored in academia, in favor of machine learning (ML) methods (Chiticariu et al., 2013). However, rule-based systems have several advantages over pure ML systems, including: (a) the rules are interpretable and thus suitable for rapid development and/or domain transfer; and (b) humans and machines can contribute to the same model. Why then have such systems failed to hold the attention of the academic community? One argument raised by Chiticariu et al. is that, despite notable previous efforts (Appelt and Onyshkevych, 1998; Levy and Andrew, 2006; Hunter et al., 2008; Cunningham et al., 2011; Chang and Manning, 2014), there is not a standard language for this task, or a ‘standard way to express rules’, which raises the entry cost for new rule-based systems. ODIN aims to address these issues with a new language and framework. We follow the simplicity principles promoted by other natural language processing toolkits, such as Stanford’s CoreNLP, which aim to ‘avoid over-design’, ‘do one thing well’, and have a user ‘up and running in ten minutes or less’ (Manning et al., 2014). …