SunJackson Blog

Twilio offers greater voice selection to customers with Amazon Polly integration

Reposted from: https://aws.amazon.com/blogs/machine-learning/twilio-offers-greater-voice-selection-to-customers-with-amazon-polly-integration/

Binny Peh

Posted on 2018-08-06

By providing a scalable cloud platform for building communications experiences, Twilio enables developers and businesses to build any customer engagement into their applications using simple and powerful APIs for voice, messaging, and video. Businesses like Morgan Stanley, Marks & Spencer, Netflix, Lyft, Airbnb, and more than 50,000 others are modernizing the way they communicate with their customers using the Twilio platform.

Read more »

Testing code with random output

Reposted from: https://www.codementor.io/strahinjalukic/testing-code-with-random-output-m2mmvqu0q

Strahinja Lukić

Posted on 2018-08-06

Many real-world applications involve code with output that contains a certain degree of randomness. Consider, for example, code for the analysis of measurements. Measurements typically result in data randomly distributed according to some well-defined distribution. The task for the analysis code may then involve statistical analyses, the result of which is, again, randomly distributed according to some distribution.
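As a rough illustration of the kind of test this excerpt has in mind, here is a minimal sketch in Python. It is not taken from the original post: the function names, the Gaussian "measurement" model, and the five-standard-error tolerance are all assumptions made for the example. The idea is to fix the random seed so the test is reproducible, and to compare the result against the expected value with a tolerance derived from the known spread rather than demanding exact equality.

```python
import math
import random
import statistics


def sample_mean(rng, n, mu=0.0, sigma=1.0):
    """Hypothetical analysis step: the mean of n Gaussian 'measurements'."""
    return statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))


def mean_within_tolerance(seed=12345, n=10_000, mu=5.0, sigma=2.0, n_sigma=5.0):
    """Return True if the estimated mean lies within n_sigma standard errors of mu."""
    rng = random.Random(seed)  # fixed seed (an assumption) keeps the test reproducible
    estimate = sample_mean(rng, n, mu, sigma)
    standard_error = sigma / math.sqrt(n)  # expected spread of the sample mean
    return abs(estimate - mu) <= n_sigma * standard_error
```

A wide tolerance such as five standard errors makes a false alarm very unlikely, at the price of missing small biases; a tighter bound trades the other way.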

Read more »

Announcing the Amazon SageMaker MXNet 1.2 container

Reposted from: https://aws.amazon.com/blogs/machine-learning/announcing-the-amazon-sagemaker-mxnet-1-2-container/

David Arpin

Posted on 2018-08-06

The Amazon SageMaker pre-built MXNet container now uses the latest release of Apache MXNet 1.2. Amazon SageMaker is a fully-managed platform that enables developers and data scientists to quickly and easily build, train, and deploy machine learning models at any scale. And the pre-built MXNet container makes it easy to write your deep learning scripts naturally but still take advantage of distributed, managed training and real-time production hosting in Amazon SageMaker.

Read more »

Essential Tips and Tricks for Starting Machine Learning with Python

Reposted from: https://www.codementor.io/tirthajyotisarkar/essential-tips-and-tricks-for-starting-machine-learning-with-python-m4qi9n3er

Tirthajyoti Sarkar

Posted on 2018-08-05

Read more »

Scale out your Pandas DataFrame operations using Dask

Reposted from: https://www.data-blogger.com/2018/08/05/scale-out-your-pandas-dataframe-operations-using-dask/

Kevin Jacobs

Posted on 2018-08-05

In Pandas, one can easily apply operations on all the data using the apply method. However, this method is quite slow and is not useful when scaling up your methods. Is there a way to speed up these operations? And if so, how? Yes, there is! This blog post will explain how you can use Dask to maximize the power of parallelization and to scale out your DataFrame operations.

Read more »

Response to Rafa: Why I don’t think ROC [receiver operating characteristic] works as a model for science

Reposted from: http://andrewgelman.com/2018/08/05/response-rafael-irizarry-dont-think-roc-receiver-operating-characteristic-works-model-science/

Andrew

Posted on 2018-08-05

Someone pointed me to this post from a few years ago where Rafael Irizarry argues that scientific “pessimists” such as myself are, at least in some fields, “missing a critical point: that in practice, there is an inverse relationship between increasing rates of true discoveries and decreasing rates of false discoveries and that true discoveries from fields such as the biomedical sciences provide an enormous benefit to society.” So far so good—within the framework in which the goal of p-value-style science is to make “discoveries” and in which these discoveries can be characterized as “true” or “false.”

Read more »

Collecting Expressions in R

Reposted from: http://www.win-vector.com/blog/2018/08/collecting-expressions-in-r/

John Mount

Posted on 2018-08-05

Not a full R article, but a quick note demonstrating by example the advantage of being able to collect many expressions and pack them into a single extend_se() node.

Read more »

R Packages worth a look

Reposted from: https://advanceddataanalytics.net/2018/08/05/r-packages-worth-a-look-1233/

Michael Laux

Posted on 2018-08-05

Optimally Robust Estimation for Extreme Value Distributions (RobExtremes): Optimally robust estimation for extreme value distributions using S4 classes and methods (based on packages ‘distr’, ‘distrEx’, ‘distrMod’, ‘RobAStBase …

Read more »

If you did not already know

Reposted from: https://advanceddataanalytics.net/2018/08/05/if-you-did-not-already-know-443/

Michael Laux

Posted on 2018-08-05

Random Subsampling: Random subsampling, also known as Monte Carlo cross-validation, multiple holdout, or repeated evaluation set, is based on randomly splitting the data into subsets whose size is defined by the user. The random partitioning of the data can be repeated arbitrarily often. In contrast to a full cross-validation procedure, random subsampling has been shown to be asymptotically consistent, resulting in more pessimistic predictions of the test data compared with cross-validation. The predictions of the test data give a realistic estimation of the predictions of external validation data. …
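The splitting scheme described in this excerpt can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the linked post: the function name `random_subsampling_splits` and its parameters are hypothetical. Each repetition shuffles the sample indices independently and holds out a user-chosen fraction as the test set.

```python
import random


def random_subsampling_splits(n_samples, test_fraction, n_repeats, seed=0):
    """Yield (train_indices, test_indices) for repeated random holdout splits."""
    rng = random.Random(seed)  # seeded so the splits are reproducible
    n_test = max(1, round(n_samples * test_fraction))  # user-defined subset size
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)  # a fresh random partition on every repetition
        yield sorted(shuffled[n_test:]), sorted(shuffled[:n_test])


# Example: four random 70/30 splits of ten samples.
splits = list(random_subsampling_splits(n_samples=10, test_fraction=0.3, n_repeats=4))
```

Unlike k-fold cross-validation, the test sets of different repetitions may overlap, and a given sample may never be held out at all.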

Read more »

Magister Dixit

Reposted from: https://advanceddataanalytics.net/2018/08/04/magister-dixit-1305/

Michael Laux

Posted on 2018-08-04

“What is a Prediction Problem? A business problem which involves predicting future events by extracting patterns in the historical data. Prediction problems are solved using Statistical techniques, mathematical models or machine learning techniques. For example: Forecasting stock price for the next week, predicting which football team wins the world cup, etc.” Suresh Kumar Gorakala (December 26, 2014)

Read more »