SunJackson Blog



“Simulations are not scalable but theory is scalable”

Reposted from: https://andrewgelman.com/2018/11/02/simulations-not-scalable-theory-scalable/

Andrew


Posted on 2018-11-02

I just watched this video on the value of theory in applied fields (like statistics); it really resonated with my previous research experience in statistical physics and on the interplay between randomised perfect sampling algorithms and Markov chain mixing, as well as with my current perspective on the status quo of deep learning. . . .

Read more »

Data Science “Paint by the Numbers” with the Hypothesis Development Canvas

Reposted from: http://feedproxy.google.com/~r/kdnuggets-data-mining-analytics/~3/p3xZj8XBunE/data-science-paint-by-numbers-hypothesis-development-canvas.html

William Schmarzo


Posted on 2018-11-02

When I was a kid, I used to love “Paint by the Numbers” sets. They make anyone who can paint or color between the lines a Rembrandt or Leonardo da Vinci (we can talk later about the long-term impact of forcing kids to “stay between the lines”).

Read more »

Data Representation for Natural Language Processing Tasks

Reposted from: http://feedproxy.google.com/~r/kdnuggets-data-mining-analytics/~3/1Z_eQ9_IT74/data-representation-natural-language-processing.html

Matthew Mayo


Posted on 2018-11-02

We have previously had a long look at a number of introductory natural language processing (NLP) topics, from approaching such tasks, to preprocessing text data, to getting started with a pair of popular Python libraries, and beyond. I was hoping to move on to exploring some different types of NLP tasks, but had it pointed out to me that I had neglected to touch on a hugely important aspect: data representation for natural language processing.

Read more »

Quick overview on the new Bioconductor 3.8 release

Reposted from: http://feedproxy.google.com/~r/RBloggers/~3/XWhyaM_-4TE/

Rstats on LIBD rstats club


Posted on 2018-11-02

Every six months the Bioconductor project releases its new version of packages. This gives developers a time window to try out new methods and test them rigorously before releasing them to the community at large. It also means that this is an exciting time. With every release there are dozens of new software packages. Bioconductor version 3.8 was just released on Halloween: October 31st, 2018. Thus, this is the perfect time to browse through their descriptions and find out what’s new that can be of use to your research.

Read more »

Document worth reading: “Transfer Metric Learning: Algorithms, Applications and Outlooks”

Reposted from: https://advanceddataanalytics.net/2018/11/02/document-worth-reading-transfer-metric-learning-algorithms-applications-and-outlooks/

Michael Laux


Posted on 2018-11-02

Distance metric learning (DML) aims to find an appropriate way to reveal the underlying data relationship. It is critical in many machine learning, pattern recognition and data mining algorithms, and usually requires a large amount of label information (class labels or pair/triplet constraints) to achieve satisfactory performance. However, the label information may be insufficient in real-world applications due to the high labeling cost, and DML may fail in this case. Transfer metric learning (TML) is able to mitigate this issue for DML in the domain of interest (target domain) by leveraging knowledge/information from other related domains (source domains). Although it has achieved a certain level of development, TML has had limited success in various aspects such as selective transfer, theoretical understanding, handling complex data, big data and extreme cases. In this survey, we present a systematic review of the TML literature. In particular, we group TML into different categories according to different settings and metric transfer strategies, such as direct metric approximation, subspace approximation, distance approximation, and distribution approximation. A summarization and insightful discussion of the various TML approaches and their applications will be presented. Finally, we provide some challenges and possible future directions. Transfer Metric Learning: Algorithms, Applications and Outlooks

Read more »

The blocks and rows theory of data shaping

Reposted from: http://feedproxy.google.com/~r/RBloggers/~3/SZetoyh0Cu8/

John Mount


Posted on 2018-11-02

We have our latest note on the theory of data wrangling up here. It discusses the roles of “block records” and “row records” in the cdata data transform tool. With that and the theory of how to design transforms, we think we have a pretty complete description of the system.

Read more »

Whats new on arXiv

Reposted from: https://advanceddataanalytics.net/2018/11/02/whats-new-on-arxiv-803/

Michael Laux


Posted on 2018-11-02

Change Surfaces for Expressive Multidimensional Changepoints and Counterfactual Prediction

Read more »

My two talks in Austria next week, on two of your favorite topics!

Reposted from: https://andrewgelman.com/2018/11/02/my-two-talks-in-austria-next-week/

Andrew


Posted on 2018-11-02

Innsbruck, 7 Nov 2018:

Read more »

Data Notes: Chinese Tourism's Impact on Taiwan

Reposted from: http://blog.kaggle.com/2018/11/01/data-notes-chinese-tourisms-impact-on-taiwan/

Paul Mooney


Posted on 2018-11-01

Chinese tourism, US elections, and PyTorch: Enjoy these new, intriguing, and overlooked datasets and kernels

Read more »

How Data Science Is Improving Higher Education

Reposted from: http://feedproxy.google.com/~r/kdnuggets-data-mining-analytics/~3/qbtfsztg0no/data-science-improving-higher-education.html

Matt Mayo Editor


Posted on 2018-11-01

By Kayla Matthews, Productivity Bytes

Read more »
© 2018 - 2019 SunJackson