What is Content Services?
R Packages worth a look
Read, Validate, Analyze, and Map Files in the General Transit Feed Specification (tidytransit): Read General Transit Feed Specification (GTFS) zipfiles into a list of R dataframes. Perform validation of the data structure against the specification …
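As a quick, hedged illustration of the package's core workflow — the file path below is a placeholder, not a file shipped with the package:

library(tidytransit)

# A minimal sketch, assuming a GTFS feed saved locally as "gtfs.zip"
gtfs <- read_gtfs("gtfs.zip")

names(gtfs)        # the GTFS tables, e.g. "stops", "routes", "trips", "stop_times"
head(gtfs$stops)   # each table is an ordinary data frame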
Discussion of the value of a mathematical model for the dissemination of propaganda
A couple of people pointed me to this article, “How to Beat Science and Influence People: Policy Makers and Propaganda in Epistemic Networks,” by James Weatherall, Cailin O’Connor, and Justin Bruner, which was also covered in a news article. Their paper begins:
If you did not already know
Jubatus
Jubatus is a distributed processing framework and streaming machine learning library. Jubatus includes these functionalities:
· Online Machine Learning Library: Classification, Regression, Recommendation (Nearest Neighbor Search), Graph Mining, Anomaly Detection, Clustering
· Feature Vector Converter (fv_converter): Data Preprocessing and Feature Extraction
· Framework for Distributed Online Machine Learning with Fault Tolerance …
The Magic of LSTM
Preface
Machine Learning Interviews
Topic Models
GBM
GBM (gradient boosting machine)
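The body of the post is not excerpted here; as a hedged illustration of the technique named in the title, here is a minimal gradient boosting fit using the R gbm package. The data and all parameter settings are invented for the example, not drawn from the post:

library(gbm)

# Simulated data, purely illustrative
set.seed(1)
df <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
df$y <- as.integer(df$x1 + df$x2 + rnorm(200) > 0)

fit <- gbm(y ~ x1 + x2, data = df,
           distribution = "bernoulli",   # logistic loss for a 0/1 outcome
           n.trees = 100, interaction.depth = 2, shrinkage = 0.1)

summary(fit)                                              # relative variable influence
head(predict(fit, df, n.trees = 100, type = "response"))  # predicted probabilities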
Jeremy Freese was ahead of the curve
Here’s sociologist Jeremy Freese writing, back in 2008:
Document worth reading: “Learning to Succeed while Teaching to Fail: Privacy in Closed Machine Learning Systems”
Security, privacy, and fairness have become critical in the era of data science and machine learning. More and more we see that achieving universally secure, private, and fair systems is practically impossible. We have seen, for example, how generative adversarial networks can be used to learn about the expected private training data; how the exploitation of additional data can reveal private information in the original data; and how seemingly unrelated features can reveal information about each other. Confronted with this challenge, in this paper we open a new line of research in which security, privacy, and fairness are learned and used in a closed environment. The goal is to ensure that a given entity (e.g., a company or a government), trusted to infer certain information from our data, is blocked from inferring protected information from it. For example, a hospital might be allowed to produce a diagnosis for a patient (the positive task) without being able to infer the gender of the subject (the negative task). Similarly, a company can guarantee that internally it is not using the provided data for any undesired task, an important goal that does not contradict the virtually impossible challenge of blocking everybody from the undesired task. We design a system that learns to succeed at the positive task while simultaneously failing at the negative one, and illustrate this with challenging cases where the positive task is actually harder than the negative one being blocked. Fairness with respect to the information in the negative task is often obtained automatically as a result of the proposed approach. The particular framework and examples open the door to security, privacy, and fairness in important closed scenarios, ranging from private data-accumulation companies such as social networks to law enforcement and hospitals. Learning to Succeed while Teaching to Fail: Privacy in Closed Machine Learning Systems
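One common way to formalize "succeed at the positive task while failing at the negative one" is an adversarial min-max objective. The formulation below is a sketch in my own notation, in the spirit of adversarial multi-task training, and not necessarily the paper's exact objective:

\min_{\theta,\,\psi}\;\max_{\phi}\;\Big[\,\mathcal{L}_{+}\big(g_{\psi}(f_{\theta}(x)),\,y_{+}\big)\;-\;\lambda\,\mathcal{L}_{-}\big(h_{\phi}(f_{\theta}(x)),\,y_{-}\big)\,\Big],\qquad \lambda > 0,

where f_\theta is a shared encoder, g_\psi is the head for the positive task (e.g., diagnosis), h_\phi is an adversary trying to recover the protected attribute (e.g., gender), and \mathcal{L}_{+}, \mathcal{L}_{-} are the two task losses. Training the encoder against the best adversary keeps the positive loss low while driving the achievable negative loss up, so the learned representation supports the allowed inference but resists the blocked one.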
Create video subtitles with translation using machine learning
Businesses around the globe require fast and reliable ways to transcribe audio or video files, often in multiple languages. This audio and video content can range from news broadcasts and call center phone interactions to job interviews, product demonstrations, and even court proceedings. The traditional transcription process is both expensive and lengthy, often involving the hiring of dedicated staff or services and a high degree of manual effort. This effort is compounded when a multi-language transcript is required, often leaving customers to over-dub the original content with a new audio track.
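As a hedged sketch of the kind of automated pipeline the post describes, here is how one might start a transcription job and translate the result from R using the paws AWS SDK. The bucket, file name, job name, and language codes are all placeholders, and the post itself may use different tooling:

library(paws)   # AWS SDK for R

transcribe <- transcribeservice()
translator <- translate()

# Start an asynchronous transcription job on a video stored in S3.
# "my-bucket", "demo.mp4", and "subtitle-demo" are placeholder names.
transcribe$start_transcription_job(
  TranscriptionJobName = "subtitle-demo",
  LanguageCode = "en-US",
  MediaFormat = "mp4",
  Media = list(MediaFileUri = "s3://my-bucket/demo.mp4")
)

# Once get_transcription_job() reports the job as complete, each caption
# line from the transcript can be translated, e.g. into Spanish:
res <- translator$translate_text(
  Text = "Hello, and welcome to the demonstration.",
  SourceLanguageCode = "en",
  TargetLanguageCode = "es"
)
res$TranslatedText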