Document worth reading: “Saliency Prediction in the Deep Learning Era: An Empirical Investigation”
Visual saliency models have enjoyed a big leap in performance in recent years, thanks to advances in deep learning and large-scale annotated data. Despite enormous effort and huge breakthroughs, however, models still fall short of human-level accuracy. In this work, I explore the landscape of the field, with an emphasis on new deep saliency models, benchmarks, and datasets. A large number of image and video saliency models are reviewed and compared over two image benchmarks and two large-scale video datasets. Further, I identify factors that contribute to the gap between models and humans and discuss remaining issues that need to be addressed to build the next generation of more powerful saliency models. Some specific questions that are addressed include: in what ways current models fail, how to remedy them, what can be learned from cognitive studies of attention, how explicit saliency judgments relate to fixations, how to conduct fair model comparison, and what the emerging applications of saliency models are.
Distilled News
Prescriptive Maintenance for Manufacturing Industry
“On the Diagramatic Diagnosis of Data” at BudapestBI 2018
A couple of days back I spoke on using diagrams (matplotlib, seaborn, pandas profiling) to diagnose data during the exploratory data analysis phase. I also introduced my new tool discover_feature_relationships, which helps prioritise which features to investigate in a new dataset by identifying pairs of features that have some sort of ‘interesting’ relationship. We finished with a short note on Bertil’s ‘data story’ concept for documenting the EDA process.
Top 10 Python Data Science Libraries
Python continues to lead the way when it comes to Machine Learning, AI, Deep Learning and Data Science tasks. According to builtwith.com, 45% of technology companies prefer to use Python for implementing AI and Machine Learning.
Because it's Friday: The physics of The Expanse
For a science fiction show set hundreds of years in the future, The Expanse is unusual in that it takes very few liberties with science as we understand it today. The solar system is made up of the familiar planets we know (other than the colonists and space stations spread throughout the system), communication is limited by the speed of light, and spaceships operate under standard Newtonian principles. You can get a good sense in this highlight reel of the visual effects of the show:
Hey, check this out: Columbia’s Data Science Institute is hiring research scientists and postdocs!
The Institute’s Postdoctoral and Research Scientists will help anchor Columbia’s presence as a leader in data-science research and applications and serve as resident experts in fostering collaborations with the world-class faculty across all schools at Columbia University. They will also help guide, plan and execute data-science research, applications and technological innovations that address societal challenges and related University-wide initiatives.
What’s new on arXiv
An Introductory Survey on Attention Mechanisms in NLP Problems
R Packages worth a look
Header-Only C++ Mathematical Optimization Library for ‘Armadillo’ (RcppEnsmallen) ‘Ensmallen’ is a templated C++ mathematical optimization library (by the ‘MLPACK’ team) that provides a simple set of abstractions for writing an object …
Example of Overfitting
I occasionally see queries on various social media as to overfitting — what is it?, etc. I’ll post an example here. (I mentioned it at my talk the other night on our novel approach to missing values, but had a bug in the code. Here is the correct account.)