Representative Approach
We propose a fast and efficient strategy, called the representative approach, for big data analysis with linear models and generalized linear models. Given a partition of a big dataset, this approach constructs a representative data point for each data block and fits the target model using the representative dataset. In terms of time complexity, it is as fast as the subsampling approaches in the literature. As for efficiency, its accuracy in estimating parameters is better than that of the divide-and-conquer method. Based on comprehensive simulation studies and theoretical justifications, we recommend two representative approaches. For linear models, or generalized linear models with a flat inverse link function and moderate coefficients of continuous variables, we recommend mean representatives (MR). For other cases, we recommend score-matching representatives (SMR). In an illustrative application to the Airline on-time performance data, the MR and SMR estimates are as good as the full data estimate, when the latter is available. Furthermore, the proposed representative strategy is ideal for analyzing massive data dispersed over a network of interconnected computers. …
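To make the idea of mean representatives concrete, here is a minimal sketch for the linear-model case. It is not the authors' implementation: the function names are hypothetical, the partition (a 4-by-4 grid over two predictors) is chosen only for illustration, and weighting each representative by its block size is an assumption that the abstract does not confirm. Score-matching representatives are not shown.

```python
import numpy as np

def mean_representatives(X, y, block_ids):
    """Collapse each block to the mean of its rows (hypothetical helper).

    Returns the representative predictors, representative responses,
    and the number of original rows in each block.
    """
    _, inverse, counts = np.unique(block_ids, return_inverse=True,
                                   return_counts=True)
    k = len(counts)
    X_rep = np.zeros((k, X.shape[1]))
    y_rep = np.zeros(k)
    np.add.at(X_rep, inverse, X)   # per-block sums of predictors
    np.add.at(y_rep, inverse, y)   # per-block sums of responses
    X_rep /= counts[:, None]       # sums -> means
    y_rep /= counts
    return X_rep, y_rep, counts

def fit_weighted_ols(X_rep, y_rep, weights):
    """Weighted least squares with an intercept, fitted on representatives.

    Weighting by block size is an assumption made for this sketch.
    """
    A = np.column_stack([np.ones(len(y_rep)), X_rep])
    sw = np.sqrt(weights)
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y_rep * sw, rcond=None)
    return beta

# Simulated data: y = 1 + 2*x1 - 3*x2 + noise
rng = np.random.default_rng(42)
X = rng.uniform(size=(100_000, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.5, size=len(X))

# Illustrative partition: a 4 x 4 grid over the two predictors
block_ids = (np.floor(X[:, 0] * 4) * 4 + np.floor(X[:, 1] * 4)).astype(int)

X_rep, y_rep, sizes = mean_representatives(X, y, block_ids)
beta_mr = fit_weighted_ols(X_rep, y_rep, sizes)
print("blocks:", len(sizes), "MR estimate:", beta_mr)
```

The model is fitted on 16 representative points instead of 100,000 rows. The partition matters: blocks formed at random would all have nearly the same mean and carry little information about the slopes.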
Gold-Mining Week 12 (2018)
The post Gold-Mining Week 12 (2018) appeared first on Fantasy Football Analytics.
Dealing with failed projects
Recently, I came up with Thoen’s law. It is an empirical one, based on several years of doing data science projects in different organisations. Here it is: The probability that you have worked on a data science project that failed approaches one very quickly as the number of projects done grows. I think many of us, far more than we as a community like to admit, will deal with projects that don’t meet their objectives. This blog does not explore why data science projects have a high risk of failing. Jonathan Nolis already did this adequately. Rather, I’ll look for strategies for dealing with projects that are failing. Disappointing as they may be, failed projects are inherently part of the novel and challenging discipline that data science is in many organisations. The following approaches might reduce the probability of failure, but that is not the main point. Their objective is to prevent failing in silence after too long a period of project time in which you try to figure things out on your own. They shift failure from the silent personal domain to the public collective one, hopefully reducing stress and blame from yourself and others.
“She also observed that results from smaller studies conducted by NGOs – often pilot studies – would often look promising. But when governments tried to implement scaled-up versions of those programs, their performance would drop considerably.”
Robert Wiblin writes:
KNNs (K-Nearest-Neighbours) in Python
The Nearest Neighbours algorithm is an optimization problem that was initially formulated in tech literature by Donald Knuth. The key idea is to find out which group of classes a random point in the search space belongs to, in a binary-class, multiclass, continuous, unsupervised, or semi-supervised algorithm. Sounds mathematical? Let’s make it simple.
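As a rough illustration of the classification case, the sketch below assigns a query point to the majority class among its k closest training points, using Euclidean distance. The function name, the toy data, and the choice of k are assumptions made for this example and do not come from the original post.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=3):
    """Return the majority label among the k nearest training points."""
    X_train = np.asarray(X_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)
    # Euclidean distance from the query to every training point
    dists = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(dists)[:k]
    # Majority vote; ties go to the label encountered first
    votes = Counter(y_train[int(i)] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy data: two well-separated groups
X_train = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y_train = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(X_train, y_train, [0.5, 0.5]))  # prints "a"
print(knn_predict(X_train, y_train, [5.5, 5.5]))  # prints "b"
```

This brute-force version computes every distance for each query, which is fine for small data. Libraries such as scikit-learn use tree-based indexes to speed up the search on larger datasets.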
Monash University: Lecturer/Sr Lecturer – Digital Health [Melbourne, Australia]
At: Monash University
Location: Melbourne, Australia
Web: www.monash.edu
Position: Lecturer/Senior Lecturer - Digital Health
6 Goals Every Wannabe Data Scientist Should Make for 2019
By Kayla Matthews, Productivity Bytes
Amazon Rekognition announces updates to its face detection, analysis, and recognition capabilities
Today we are announcing updates to our face detection, analysis, and recognition features. These updates provide customers with improvements in the ability to detect more faces from images, perform higher accuracy face matches, and obtain improved age, gender, and emotion attributes for faces in images. Amazon Rekognition customers can use each of these enhancements starting today, at no additional cost. No machine learning experience is required.
OpenCPU 2.1 Release: Scalable R Services
Monash University: Research Fellow (Digital Civics) [Melbourne, Australia]
At: Monash University
Location: Melbourne, Australia
Web: www.monash.edu
Position: Research Fellow (Digital Civics)