We’re excited to announce the winners of the 2018 AWS AI Hackathon. Horacio Canales has won first place with his “Second Alert” project. This project enables users from around the world to identify missing persons, including human trafficking victims, children too young to remember their family members’ names, and mentally handicapped individuals. Horacio built the solution using image analysis, text analysis, and conversational agents with Amazon Rekognition, Amazon Comprehend, and Amazon Lex. In recognition of his contribution, Horacio will receive $5,000 USD and $2,500 in AWS Credits.
Gender Diversity in the R and Python Communities
Many (if not most) tech communities have far more representation from men than from women (and even fewer from nonbinary folk). This is a shame, because everybody uses software, and these projects would self-evidently benefit from the talent and expertise from across the entire community. Some projects are doing better than others, though, and data scientist Reshama Shaikh recently published an in-depth comparison of the representation of women in the R and Python communities.
How to build a data science project from scratch
By Jekaterina Kokatjuhha, Research Engineer at Zalando.
Niall Ferguson and the perils of playing to your audience
History professor Niall Ferguson had another case of the sillies.
“Statistical insights into public opinion and politics” (my talk for the Columbia Data Science Society this Wed 9pm)
7pm in Fayerweather 310:
If you did not already know
Linear Additive Markov Process (LAMP) We introduce LAMP: the Linear Additive Markov Process. Transitions in LAMP may be influenced by states visited in the distant history of the process, but unlike higher-order Markov processes, LAMP retains an efficient parametrization. LAMP also allows the specific dependence on history to be learned efficiently from data. We characterize some theoretical properties of LAMP, including its steady-state and mixing time. We then give an algorithm based on alternating minimization to learn LAMP models from data. Finally, we perform a series of real-world experiments to show that LAMP is more powerful than first-order Markov processes, and even holds its own against deep sequential models (LSTMs) with a negligible increase in parameter complexity. …
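The abstract does not spell out the transition rule, so the following is only a minimal sketch under one assumed formulation: the next-state distribution is a weighted mixture of first-order transition rows, one row per lagged state. The names `transition`, `lag_weights`, and both functions are hypothetical, and the handling of short histories (renormalizing the usable weights) is a choice made for this sketch rather than something taken from the source.

```python
import random


def lamp_next_state_distribution(history, transition, lag_weights):
    """Mix first-order transition rows over lagged states (assumed LAMP form).

    history     : list of visited state indices, most recent last
    transition  : row-stochastic matrix as a list of lists
    lag_weights : non-negative weights; lag_weights[0] applies to lag 1
    """
    n_states = len(transition)
    # Early in the sequence some lags reach before the start of the history.
    # This sketch drops them and renormalizes the rest (an assumption).
    usable = lag_weights[:len(history)]
    total = sum(usable)
    dist = [0.0] * n_states
    for lag, weight in enumerate(usable, start=1):
        past_state = history[-lag]
        for j in range(n_states):
            dist[j] += (weight / total) * transition[past_state][j]
    return dist


def sample_lamp(initial_state, transition, lag_weights, steps, rng):
    """Draw a trajectory of `steps` transitions starting from `initial_state`."""
    history = [initial_state]
    for _ in range(steps):
        dist = lamp_next_state_distribution(history, transition, lag_weights)
        next_state = rng.choices(range(len(dist)), weights=dist)[0]
        history.append(next_state)
    return history


if __name__ == "__main__":
    # Toy two-state example; the numbers are placeholders, not fitted values.
    toy_transition = [[0.9, 0.1], [0.2, 0.8]]
    toy_weights = [0.5, 0.5]
    print(sample_lamp(0, toy_transition, toy_weights, 10, random.Random(0)))
```

Under this assumed form, the model stores one transition matrix plus one weight per lag, whereas a full higher-order chain over the same number of lags would need a separate row for every combination of past states.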
rnoaa: new data sources and NCDC units
rOpenSci - open tools for open science
We’ve just released a new version of rnoaa with A LOT of changes. Check out the release notes for a complete list of changes.
Heatmaps of Mortality Rates
As part of the run-up to the release of Data Visualization (out in about ten days! Currently 30% off on Amazon!), I’ve been playing with graphing different kinds of data. One great source of rich time-series data is mortality.org, which hosts a collection of standardized demographic data for a large number of countries. Mortality rates are often interesting to look at as a heatmap, as we get a rate for each of a series of ages over some time period. If we make a grid with time on the x-axis and age on the y-axis, we can fill in the boxes with a color representing the rate. A side-effect of that sort of representation is that the diagonals track age cohorts from the bottom left to the upper right of the graph, as people age at the rate of one year per year.
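The grid layout and the cohort diagonals can be sketched in a few lines of plain Python. This is only an illustration of the geometry, assuming rates are held in a dictionary keyed by `(year, age)`; the function names are hypothetical and nothing here reflects the actual format of the mortality.org files or the plotting code used for the post.

```python
def build_grid(rates, years, ages):
    """Arrange rates with one row per age (y-axis) and one column per year (x-axis).

    rates : dict mapping (year, age) -> rate; missing cells become None
    """
    return [[rates.get((year, age)) for year in years] for age in ages]


def cohort_cells(years, ages, birth_year):
    """Return (column, row) grid positions for people born in `birth_year`.

    Someone born in `birth_year` is `year - birth_year` years old in `year`,
    so moving one column to the right also moves one row up: a diagonal.
    """
    age_to_row = {age: row for row, age in enumerate(ages)}
    cells = []
    for col, year in enumerate(years):
        age = year - birth_year
        if age in age_to_row:
            cells.append((col, age_to_row[age]))
    return cells


if __name__ == "__main__":
    years = list(range(1900, 1905))
    ages = list(range(0, 5))
    # The 1902 birth cohort enters the grid at age 0 in the third column.
    print(cohort_cells(years, ages, 1902))
```

With age increasing up the y-axis, each cohort's cells run from lower left to upper right, which is why a shock to one birth cohort shows up as a diagonal streak in the heatmap.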
R Packages worth a look
A Robust and Powerful Test of Abnormal Stock Returns in Long-Horizon Event Studies (crseEventStudy) Based on Dutta et al. (2018) <doi:10.1016/j.jempfin.2018.02.004>, this package provides their standardized test for abnormal returns in long-hori …
Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning: December and Beyond
You can also find the latest list on