By Adolfo Martínez, Data Scientist at datank.ai.
When the numbers don’t tell the whole story
Anscombe’s Quartet is a famous collection of four small data sets, just 11 (x,y) pairs each, that was developed in the 1970s to emphasize that numerical summaries of data sometimes aren’t enough. (For a modern take on this idea, see also the Datasaurus Dozen.) In this case, it takes visualizing the data to realize that the four data sets are qualitatively very different, even though the means, variances, and regression coefficients are nearly identical. In the video below for Guy in a Cube, Buck Woody uses R to summarize the data (which is conveniently built into R) and visualize it using an R script in Power BI.
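To make the point about matching summaries concrete, here is a minimal sketch in Python rather than the R used in the video. It is an illustration only, not the script from the video: the values are the standard published Anscombe data typed in by hand, and the names `QUARTET`, `summarize`, and `SUMMARIES` are made up for this example.

```python
from statistics import mean, variance

# Standard Anscombe values, hard-coded here (an assumption of this sketch;
# in R the same data ships as the built-in `anscombe` data frame).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return means, sample variances, and the least-squares line for y on x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return {
        "mean_x": mx,
        "mean_y": my,
        "var_x": variance(xs),
        "var_y": variance(ys),
        "slope": slope,
        "intercept": my - slope * mx,
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

for name, s in SUMMARIES.items():
    print(
        f"{name:>3}: mean_x={s['mean_x']:.2f} mean_y={s['mean_y']:.2f} "
        f"var_x={s['var_x']:.2f} var_y={s['var_y']:.2f} "
        f"slope={s['slope']:.3f} intercept={s['intercept']:.2f}"
    )
```

The four printed rows agree to about two decimal places, which is why a scatter plot of each set is needed to see how different they are.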
SiliconANGLE: Machine learning automation startup DataRobot lands $100M round
Well-funded machine learning startup DataRobot Inc. has just added $100 million more to its war chest via a late-stage round of funding led by Meritech and Sapphire Ventures.
A study fails to replicate, but it continues to get referenced as if it had no problems. Communication channels are blocked.
In 2005, Michael Kosfeld, Markus Heinrichs, Paul Zak, Urs Fischbacher, and Ernst Fehr published a paper, “Oxytocin increases trust in humans.” According to Google, that paper has been cited 3389 times.
U. of Zurich: Professorship in Big Data Science (Open Rank) [Zurich, Switzerland]
At: U. of Zurich
Location: Zurich, Switzerland
Web: www.ifi.uzh.ch/en.html
Join us at the EARL US Roadshow – a conference dedicated to the real-world usage of R
EARL, the Enterprise Applications of the R Language Conference, is set to embark on a US roadshow following a successful London conference in September, with dates in Seattle, Houston and Boston between 7th and 13th November.
Generative Adversarial Networks – Paper Reading Road Map
By İdil Sülo, Middle East Technical University
Understanding Amazon SageMaker notebook instance networking configurations and advanced routing options
An Amazon SageMaker notebook instance provides a Jupyter notebook app through a fully managed machine learning (ML) Amazon EC2 instance. Amazon SageMaker Jupyter notebooks are used to perform advanced data exploration, create training jobs, deploy models to Amazon SageMaker hosting, and test or validate your models.
If you did not already know
Sounding Board
We present Sounding Board, a social chatbot that won the 2017 Amazon Alexa Prize. The system architecture consists of several components including spoken language processing, dialogue management, language generation, and content management, with emphasis on user-centric and content-driven design. We also share insights gained from large-scale online logs based on 160,000 conversations with real-world users. …