Longitudinal Concordance Correlation (lcc)
Estimates the longitudinal concordance correlation to assess the longitudinal agreement profile. The estimation approach implemented is variance compon …
The evolution of pace in popular movies
James Cutting writes:
EARL conference recap: Seattle 2018
I had the pleasure of attending the EARL (Enterprise Applications of the R Language) Conference held in Seattle on 2018-11-07, and the honour of being one of the speakers. The EARL conferences occupy a unique niche in the R conference universe, bringing together the I-use-it-at-work contingent of the R community. The Seattle event was, from my perspective (I use R at work and lead a team of data scientists that uses R), a fantastic conference. Full marks to the folks from Mango Solutions for organizing it!
Document worth reading: “Learning From Positive and Unlabeled Data: A Survey”
Learning from positive and unlabeled data or PU learning is the setting where a learner only has access to positive examples and unlabeled data. The assumption is that the unlabeled data can contain both positive and negative examples. This setting has attracted increasing interest within the machine learning literature as this type of data naturally arises in applications such as medical diagnosis and knowledge base completion. This article provides a survey of the current state of the art in PU learning. It proposes seven key research questions that commonly arise in this field and provides a broad overview of how the field has tried to address them. Learning From Positive and Unlabeled Data: A Survey
R Packages worth a look
Lindley Power Series Distribution (LindleyPowerSeries)
Computes the probability density function, the cumulative distribution function, the hazard rate function, the quantile function, and random generation …
Magister Dixit
“Data science, surprisingly perhaps, is not about designing the most advanced machine learning algorithms and training them on all of the data (and then having Skynet). It’s about finding the right data, becoming a quasi-expert on the process, system, or event you are trying to model, and crafting features that will help quirky and sometimes frail statistical algorithms make accurate predictions. Very little time is actually spent on the algorithm itself.” Scott W. Strong (April 10, 2018)
Top 5 domains Big Data analytics helps to transform
By Tetiana Boichenko, n-ix.com.
If you did not already know
Data Lineage Analysis
“Data lineage is defined as a data life cycle that includes the data’s origins and where it moves over time.” It describes what happens to data as it goes through diverse processes. It helps provide visibility into the analytics pipeline and simplifies tracing errors back to their sources. It also enables replaying specific portions or inputs of the dataflow for step-wise debugging or regenerating lost output. In fact, database systems have long used such information, called data provenance, to address similar validation and debugging challenges. Data provenance documents the inputs, entities, systems, and processes that influence data of interest, in effect providing a historical record of the data and its origins. The generated evidence supports essential forensic activities such as data-dependency analysis, error/compromise detection and recovery, and auditing and compliance analysis. “Lineage is a simple type of why provenance.” …
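As a toy illustration of the idea above (a hypothetical sketch, not any particular lineage system's API): provenance is just a record of which inputs and which processing step produced each artifact, and tracing errors back to their sources is a walk over those records. All names here are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class ProvenanceRecord:
    """Minimal provenance: which step and which inputs produced an output."""
    output: str
    step: str
    inputs: list = field(default_factory=list)

# Hypothetical pipeline: record lineage as each artifact is derived.
lineage = {}

def derive(output, step, inputs):
    lineage[output] = ProvenanceRecord(output, step, list(inputs))
    return output

derive("clean.csv", "deduplicate", ["raw.csv"])
derive("model.pkl", "train", ["clean.csv", "params.json"])

def trace(artifact):
    """Walk the lineage records back to an artifact's original sources."""
    rec = lineage.get(artifact)
    if rec is None:
        return {artifact}          # not derived: an original input
    sources = set()
    for inp in rec.inputs:
        sources |= trace(inp)
    return sources
```

Here `trace("model.pkl")` recovers the original sources `raw.csv` and `params.json`, which is exactly the data-dependency analysis the excerpt describes.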
If you did not already know
Double Path Networks for Sequence to Sequence Learning (DPN-S2S)
Encoder-decoder based Sequence to Sequence learning (S2S) has made remarkable progress in recent years. Different network architectures have been used in the encoder/decoder. Among them, Convolutional Neural Networks (CNN) and Self Attention Networks (SAN) are the prominent ones. The two architectures achieve similar performance but encode and decode context in very different ways: CNNs use convolutional layers to focus on the local connectivity of the sequence, while SANs use self-attention layers to focus on global semantics. In this work, we propose Double Path Networks for Sequence to Sequence learning (DPN-S2S), which leverages the advantages of both models by using double path information fusion. During the encoding step, we develop a double path architecture to maintain the information coming from different paths with convolutional layers and self-attention layers separately. To effectively use the encoded context, we develop a cross attention module with gating and use it to automatically pick up the information needed during the decoding step. By deeply integrating the two paths with cross attention, both types of information are combined and well exploited. Experiments show that our proposed method can significantly improve the performance of sequence to sequence learning over state-of-the-art systems. …
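A toy sketch of the gated fusion idea described in the abstract (illustrative shapes and random, untrained parameters; this is not the paper's actual architecture): a sigmoid gate decides, per position and per feature, how much to take from the convolutional (local) path versus the self-attention (global) path.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder outputs for one sequence, shape (length, hidden).
# In DPN-S2S these would come from the CNN and self-attention paths.
seq_len, hidden = 5, 8
cnn_path = rng.normal(size=(seq_len, hidden))   # local-context features
san_path = rng.normal(size=(seq_len, hidden))   # global-context features

# Illustrative gate parameters (learned in the real model).
W = rng.normal(size=(2 * hidden, hidden))
b = np.zeros(hidden)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# The gate looks at both paths and mixes them elementwise.
gate = sigmoid(np.concatenate([cnn_path, san_path], axis=-1) @ W + b)
fused = gate * cnn_path + (1.0 - gate) * san_path
```

Because the gate lies strictly between 0 and 1, each fused feature is a convex combination of the two paths' features, so neither information source is ever fully discarded.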
Counting digits by @ellis2013nz
Counting digits appearing in page numbers
The other day in a training session, the facilitators warmed people up into intellectual work with this group exercise:
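Assuming the exercise matches the post's title, counting how often each digit appears among page numbers 1..N, the core tally can be sketched in a few lines (a Python sketch; the function name is illustrative):

```python
from collections import Counter

def digit_counts(n_pages):
    """Count how often each digit 0-9 appears in page numbers 1..n_pages."""
    counts = Counter()
    for page in range(1, n_pages + 1):
        counts.update(str(page))
    return {d: counts[str(d)] for d in range(10)}
```

For a 20-page booklet, for example, the digit 1 appears 12 times (once in 1, ten times as the tens digit of 10-19, and once more in 11's units place), while 0 appears only twice (in 10 and 20).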