SunJackson Blog

Miami University: Director of the Center for Analytics & Data Science (CADS) [Oxford, OH]

Reposted from: http://feedproxy.google.com/~r/kdnuggets-data-mining-analytics/~3/r_FswqaOnYU/12-19-miami-university-director-cads.html

Matt Mayo Editor

Posted on 2018-12-20

At: Miami University
Location: Oxford, OH
Web: www.miami.miamioh.edu
Position: Director of the Center for Analytics & Data Science (CADS)

Read more »

Distilled News

Reposted from: https://analytixon.com/2018/12/21/distilled-news-938/

Michael Laux

Posted on 2018-12-20

How Kubernetes Platform Works at the Fundamental Level

Read more »

Exploring model fit by looking at a histogram of a posterior simulation draw of a set of parameters in a hierarchical model

Reposted from: https://andrewgelman.com/2018/12/20/exploring-model-fit-looking-histogram-posterior-simulation-draw-set-parameters-hierarchical-model/

Andrew

Posted on 2018-12-20

Opher Donchin writes in with a question:

Read more »

Whats new on arXiv

Reposted from: https://analytixon.com/2018/12/20/whats-new-on-arxiv-848/

Michael Laux

Posted on 2018-12-20

Distill-Net: Application-Specific Distillation of Deep Convolutional Neural Networks for Resource-Constrained IoT Platforms

Read more »

Document worth reading: “A second-quantised Shannon theory”

Reposted from: https://analytixon.com/2018/12/20/document-worth-reading-a-second-quantised-shannon-theory/

Michael Laux

Posted on 2018-12-20

Shannon’s theory of information was built on the assumption that the information carriers were classical systems. Its quantum counterpart, quantum Shannon theory, explores the new possibilities that arise when the information carriers are quantum particles. Traditionally, quantum Shannon theory has focussed on scenarios where the internal state of the particles is quantum, while their trajectory in spacetime is classical. Here we propose a second level of quantisation where both the information and its propagation in spacetime are treated quantum mechanically. The framework is illustrated with a number of examples, showcasing some of the counterintuitive phenomena taking place when information travels in a superposition of paths. A second-quantised Shannon theory

Read more »

Amazon SageMaker adds Scikit-Learn support

Reposted from: https://aws.amazon.com/blogs/machine-learning/amazon-sagemaker-adds-scikit-learn-support/

Laurence Rouesnel

Posted on 2018-12-20

Amazon SageMaker now comes pre-configured with the Scikit-Learn machine learning library in a Docker container. Scikit-Learn is a popular choice for data scientists and developers because it provides efficient tools for data analysis and high-quality implementations of popular machine learning algorithms through a consistent Python interface and well-documented APIs. Scikit-Learn executes quickly and can scale to most data sets and problems, making it an ideal choice when you need to iterate quickly on your machine learning problems. Unlike deep learning frameworks such as TensorFlow or MXNet, Scikit-Learn focuses on classical machine learning and data analysis. You can select from a range of supervised and unsupervised learning algorithms for clustering, regression, classification, dimensionality reduction, feature preprocessing, and model selection.

Read more »

Whats new on arXiv

Reposted from: https://analytixon.com/2018/12/20/whats-new-on-arxiv-849/

Michael Laux

Posted on 2018-12-20

Mining Interpretable AOG Representations from Convolutional Networks via Active Question Answering

Read more »

Easily train models using datasets labeled by Amazon SageMaker Ground Truth

Reposted from: https://aws.amazon.com/blogs/machine-learning/easily-train-models-using-datasets-labeled-by-amazon-sagemaker-ground-truth/

Sumit Thakur

Posted on 2018-12-20

Data scientists and developers can now easily train machine learning models on datasets labeled by Amazon SageMaker Ground Truth. Amazon SageMaker Training now accepts the labeled datasets produced in augmented manifest format as input through both the AWS Management Console and the Amazon SageMaker Python SDK APIs.
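
As a rough illustration of what such a labeled dataset can look like, the sketch below assumes an augmented manifest is a JSON Lines file in which each line carries a `source-ref` key pointing at the data object plus one or more label attributes. The bucket, file names, the `animal-label` attribute, and the `read_manifest` helper are all hypothetical, and the parsing is done locally with the standard library rather than through any SageMaker API.

```python
import json

# Hypothetical augmented manifest content: one JSON object per line.
# "source-ref" points at the data object; "animal-label" is a made-up
# label attribute name used only for this illustration.
MANIFEST_TEXT = """\
{"source-ref": "s3://example-bucket/images/001.jpg", "animal-label": 0}
{"source-ref": "s3://example-bucket/images/002.jpg", "animal-label": 1}
{"source-ref": "s3://example-bucket/images/003.jpg", "animal-label": 0}
"""


def read_manifest(text, attribute_names):
    """Return one tuple per manifest line with the requested attributes, in order."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        entry = json.loads(line)
        records.append(tuple(entry[name] for name in attribute_names))
    return records


pairs = read_manifest(MANIFEST_TEXT, ["source-ref", "animal-label"])
for uri, label in pairs:
    print(uri, label)
```

Selecting attributes by name mirrors the idea that a training job only needs to be told which keys of each manifest line hold the data reference and the label.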

Read more »

Day 20 – little helper char_replace

Reposted from: http://feedproxy.google.com/~r/RBloggers/~3/vEoun1gSnRs/

Jakob Gepp

Posted on 2018-12-20

We at STATWORX work a lot with R and we often use the same little helper functions within our projects. These functions ease our daily work life by reducing repetitive code parts or by creating overviews of our projects. At first, there was no plan to make a package, but I soon realised that it would be much easier to share and improve those functions if they were in a package. Until the 24th of December, I will present one function each day from helfRlein. So, on the 20th day of Christmas my true love gave to me…

Read more »

If you did not already know

Reposted from: https://analytixon.com/2018/12/20/if-you-did-not-already-know-584/

Michael Laux

Posted on 2018-12-20

Helix: Machine learning workflow development is a process of trial and error: developers iterate on workflows by testing out small modifications until the desired accuracy is achieved. Unfortunately, existing machine learning systems focus narrowly on model training (a small fraction of the overall development time) and neglect to address iterative development. We propose Helix, a machine learning system that optimizes the execution across iterations, intelligently caching and reusing, or recomputing, intermediates as appropriate. Helix captures a wide variety of application needs within its Scala DSL, with succinct syntax defining unified processes for data preprocessing, model specification, and learning. We demonstrate that the reuse problem can be cast as a Max-Flow problem, while the caching problem is NP-Hard. We develop effective lightweight heuristics for the latter. Empirical evaluation shows that Helix is not only able to handle a wide variety of use cases in one unified workflow but is also much faster, providing run time reductions of up to 19x over state-of-the-art systems, such as DeepDive or KeystoneML, on four real-world applications in natural language processing, computer vision, social and natural sciences. …

Read more »