Ya Xu

Ya Xu

Mountain View, California, United States
23K followers 500+ connections

About

I'm passionate about bridging science and engineering to create impactful results.

Experience

Education

  •  Graphic

    -

    -

    My thesis was on semi-supervised learning on graphs. In semi-supervised
    learning on graphs, features observed at one node are used to estimate missing values
    at other nodes. Many prediction methods have been proposed in the machine learning
    community over the past few years. In this thesis we show that several such proposals
    are equivalent to kriging predictors based on a fixed covariance matrix driven by the
    link structure of the graph. We then propose a data-driven estimator of…

    My thesis was on semi-supervised learning on graphs. In semi-supervised
    learning on graphs, features observed at one node are used to estimate missing values
    at other nodes. Many prediction methods have been proposed in the machine learning
    community over the past few years. In this thesis we show that several such proposals
    are equivalent to kriging predictors based on a fixed covariance matrix driven by the
    link structure of the graph. We then propose a data-driven estimator of the correlation
    structure that exploits patterns among the observed response values. We also show
    how we can scale some of the algorithms to large graphs. Finally, we investigate
    the fundamental smoothness assumption underlying many prediction methods by
    exploring some normality properties arising from empirical data analysis.

  • -

    -

  • -

    -

Licenses & Certifications

Publications

  • Evaluating Mobile Apps with A/B and Quasi A/B Tests

    KDD 2016

    We have seen an explosive growth of mobile usage, particularly on mobile apps. It is more important than ever to be able to properly evaluate mobile app release. A/B testing is a standard framework to evaluate new ideas. We have seen much of its applications in the online world across the industry [9,10,12]. Running A/B tests on mobile apps turns out to be quite different, and much of it is attributed to the fact that we cannot ship code easily to mobile apps other than going through a lengthy…

    We have seen an explosive growth of mobile usage, particularly on mobile apps. It is more important than ever to be able to properly evaluate mobile app release. A/B testing is a standard framework to evaluate new ideas. We have seen much of its applications in the online world across the industry [9,10,12]. Running A/B tests on mobile apps turns out to be quite different, and much of it is attributed to the fact that we cannot ship code easily to mobile apps other than going through a lengthy build, review and release process. Mobile infrastructure and user behavior differences also contribute to how A/B tests are conducted differently on mobile apps, which will be discussed in details in this paper. In addition to measuring features individually in the new app version through randomized A/B tests, we have a unique opportunity to evaluate the mobile app as a whole using the quasi-experimental framework [21]. Not all features can be A/B tested due to infrastructure changes and wholistic product redesign. We propose and establish quasi-experiment techniques for measuring impact from mobile app release, with results shared from a recent major app launch at LinkedIn.

    Other authors
  • From Infrastructure to Culture: A/B Testing Challenges in Large Scale Social Networks

    KDD 2015

    A/B testing, also known as bucket testing, split testing, or controlled experiment, is a standard way to evaluate user engagement or satisfaction from a new service, feature, or product. It is widely used among online websites, including social network sites such as Facebook, LinkedIn, and Twitter to make data-driven decisions. At LinkedIn, we have seen tremendous growth of controlled experiments over time, with now over 400 concurrent experiments running per day. General A/B testing…

    A/B testing, also known as bucket testing, split testing, or controlled experiment, is a standard way to evaluate user engagement or satisfaction from a new service, feature, or product. It is widely used among online websites, including social network sites such as Facebook, LinkedIn, and Twitter to make data-driven decisions. At LinkedIn, we have seen tremendous growth of controlled experiments over time, with now over 400 concurrent experiments running per day. General A/B testing frameworks and methodologies, including challenges and pitfalls, have been discussed extensively in several previous KDD work. In this paper, we describe in depth the experimentation platform we have built at LinkedIn and the challenges that arise particularly when running A/B tests at large scale in a social network setting. We start with an introduction of the experimentation platform and how it is built to handle each step of the A/B testing process at LinkedIn, from designing and deploying experiments to analyzing them. It is then followed by discussions on several more sophisticated A/B testing scenarios, such as running offline experiments and addressing the network effect, where one user’s action can influence that of another. Lastly, we talk about features and processes that are crucial for building a strong experimentation culture.

    Other authors
  • Network A/B Testing: From Sampling to Estimation

    WWW 2015

    A/B testing, also known as bucket testing, split testing, or controlled experiment, is a standard way to evaluate user engagement or satisfaction from a new service, feature, or product. It is widely used in online websites, including social network sites such as Facebook, LinkedIn, and Twitter to make data-driven decisions. The goal of A/B testing is to estimate the treatment effect of a new change, which becomes intricate when users are interacting, i.e., the treatment effect of a user may…

    A/B testing, also known as bucket testing, split testing, or controlled experiment, is a standard way to evaluate user engagement or satisfaction from a new service, feature, or product. It is widely used in online websites, including social network sites such as Facebook, LinkedIn, and Twitter to make data-driven decisions. The goal of A/B testing is to estimate the treatment effect of a new change, which becomes intricate when users are interacting, i.e., the treatment effect of a user may spill over to other users via underlying social connections. When conducting these online controlled experiments, it is a common practice to make the Stable Unit Treatment Value Assumption (SUTVA) that each individual’s response is affected by their own treatment only. Though this assumption simplifies the estimation of treatment effect, it does not hold when network interference is present, and may even lead to wrong conclusion.
    In this paper, we study the problem of network A/B testing in real networks, which have substantially different characteristics from the simulated random networks studied in previous works. We first examine the existence of network effect in a recent online experiment conducted at LinkedIn; Secondly, we propose an efficient and effective estimator for Average Treatment Effect (ATE) considering the interference between users in real online experiments; Finally, we apply our method in both simulations and a real world online experiment. The simulation results show that our estimator achieves better performance with respect to both bias and variance reduction. The real world online experiment not only demonstrates that large-scale network A/B test is feasible but also further validates many of our observations in the simulation studies.

    Other authors
    • Huan Gui
    • Anmol Bhasin
    • Jiawei Han
  • Seven Rules of Thumb for Web Site Experimenters

    KDD 2014

    Web site owners, from small web sites to the largest properties that include Amazon, Facebook, Google, LinkedIn, Microsoft, and Yahoo, attempt to improve their web sites, optimizing for criteria ranging from repeat usage, time on site, to revenue. Having been involved in running thousands of controlled experiments at Amazon, Booking.com, LinkedIn, and multiple Microsoft properties, we share seven rules of thumb for experimenters, which we have generalized from these experiments and their…

    Web site owners, from small web sites to the largest properties that include Amazon, Facebook, Google, LinkedIn, Microsoft, and Yahoo, attempt to improve their web sites, optimizing for criteria ranging from repeat usage, time on site, to revenue. Having been involved in running thousands of controlled experiments at Amazon, Booking.com, LinkedIn, and multiple Microsoft properties, we share seven rules of thumb for experimenters, which we have generalized from these experiments and their results. These are principles that we believe have broad applicability in web optimization and analytics outside of controlled experiments, yet they are not provably correct, and in some cases exceptions are known.

    To support these rules of thumb, we share multiple real examples, most being shared in a public paper for the first time. Some rules of thumb have previously been stated, such as “speed matters,” but we describe the assumptions in the experimental design and share additional experiments that improved our understanding of where speed matters more: certain areas of the web page are more critical.

    This paper serves two goals. First, it can guide experimenters with rules of thumb that can help them optimize their sites. Second, it provides the KDD community with new research challenges on the applicability, exceptions, and extensions to these, one of the goals for KDD’s industrial track.

    Other authors
    See publication
  • Online Controlled Experiments at Large Scale

    KDD 2013

    Web-facing companies, including Amazon, eBay, Etsy, Facebook, Google, Groupon, Intuit, LinkedIn, Microsoft, Netflix, Shop Direct, StumbleUpon, Yahoo, and Zynga use online controlled experiments to guide product development and accelerate innovation. At Microsoft’s Bing, the use of controlled experiments has grown exponentially over time, with over 200 concurrent experiments now running on any given day. Running experiments at large scale requires addressing multiple challenges in three areas:…

    Web-facing companies, including Amazon, eBay, Etsy, Facebook, Google, Groupon, Intuit, LinkedIn, Microsoft, Netflix, Shop Direct, StumbleUpon, Yahoo, and Zynga use online controlled experiments to guide product development and accelerate innovation. At Microsoft’s Bing, the use of controlled experiments has grown exponentially over time, with over 200 concurrent experiments now running on any given day. Running experiments at large scale requires addressing multiple challenges in three areas: cultural/organizational, engineering, and trustworthiness. On the cultural and organizational front, the larger organization needs to learn the reasons for running controlled experiments and the tradeoffs between controlled experiments and other methods of evaluating ideas. We discuss why negative experiments, which degrade the user experience short term, should be run, given the learning value and long-term benefits. On the engineering side, we architected a highly scalable system, able to handle data at massive scale: hundreds of concurrent experiments, each containing millions of users. Classical testing and debugging techniques no longer apply when there are millions of live variants of the site, so alerts are used to identify issues rather than relying on heavy up-front testing. On the trustworthiness front, we have a high occurrence of false positives that we address, and we alert experimenters to statistical interactions between experiments. The Bing Experimentation System is credited with having accelerated innovation and increased annual revenues by hundreds of millions of dollars, by allowing us to find and focus on key ideas evaluated through thousands of controlled experiments. A 1% improvement to revenue equals $10M annually in the US, yet many ideas impact key metrics by 1% and are not well estimated a-priori. The system has also identified many negative features that we avoided deploying, despite key stakeholders’ early excitement, saving us similar large amounts

    Other authors
    See publication
  • Improving the Sensitivity of Online Controlled Experiments by Utilizing Pre-Experiment Data

    WSDM 2013: The Sixth ACM International Conference on Web Search and Data Mining

    Online controlled experiments are at the heart of making data-driven decisions at a diverse set of companies, including Amazon, eBay, Facebook, Google, Microsoft, Yahoo, and Zynga. Small differences in key metrics, on the order of fractions of a percent, may have very significant business implications. At Bing it is not uncommon to see experiments that impact annual revenue by millions of dollars, even tens of millions of dollars, either positively or negatively. With thousands of experiments…

    Online controlled experiments are at the heart of making data-driven decisions at a diverse set of companies, including Amazon, eBay, Facebook, Google, Microsoft, Yahoo, and Zynga. Small differences in key metrics, on the order of fractions of a percent, may have very significant business implications. At Bing it is not uncommon to see experiments that impact annual revenue by millions of dollars, even tens of millions of dollars, either positively or negatively. With thousands of experiments being run annually, improving the sensitivity of experiments allows for more precise assessment of value, or equivalently running the experiments on smaller populations (supporting more experiments) or for shorter durations (improving the feedback cycle and agility). We propose an approach (CUPED) that utilizes data from the pre-experiment period to reduce metric variability and hence achieve better sensitivity. This technique is applicable to a wide variety of key business metrics, and it is practical and easy to implement. The results on Bing’s experimentation system are very successful: we can reduce variance by about 50%, effectively achieving the same statistical power with only half of the users, or half the duration.

    Other authors
    See publication
  • Trustworthy Online Controlled Experiments: Five Puzzling Outcomes Explained

    KDD 2012

    Online controlled experiments are often utilized to make data-driven decisions at Amazon, Microsoft, eBay, Facebook, Google, Yahoo, Zynga, and at many other companies. While the theory of a controlled experiment is simple, and dates back to Sir Ronald A. Fisher’s experiments at the Rothamsted Agricultural Experimental Station in England in the 1920s, the deployment and mining of online controlled experiments at scale—thousands of experiments now—has taught us many lessons. These exemplify the…

    Online controlled experiments are often utilized to make data-driven decisions at Amazon, Microsoft, eBay, Facebook, Google, Yahoo, Zynga, and at many other companies. While the theory of a controlled experiment is simple, and dates back to Sir Ronald A. Fisher’s experiments at the Rothamsted Agricultural Experimental Station in England in the 1920s, the deployment and mining of online controlled experiments at scale—thousands of experiments now—has taught us many lessons. These exemplify the proverb that the difference between theory and practice is greater in practice than in theory. We present our learnings as they happened: puzzling outcomes of controlled experiments that we analyzed deeply to understand and explain. Each of these took multiple-person weeks to months to properly analyze and get to the often surprising root cause. The root causes behind these puzzling results are not isolated incidents; these issues generalized to multiple experiments. The heightened awareness should help readers increase the trustworthiness of the results coming out of controlled experiments. At Microsoft’s Bing, it is not uncommon to see experiments that impact annual revenue by millions of dollars, thus getting trustworthy results is critical and investing in understanding anomalies has tremendous payoff: reversing a single incorrect decision based on the results of an experiment can fund a whole team of analysts. The topics we cover include: the OEC (Overall Evaluation Criterion), click tracking, effect trends, experiment length and power, and carryover effects.

    Other authors
    See publication
  • CUR from a Sparse Optimization Viewpoint

    Advances in Neural Information Processing Systems 23 (NIPS 2010)

    The CUR decomposition provides an approximation of a matrix X that has low
    reconstruction error and that is sparse in the sense that the resulting approximation
    lies in the span of only a few columns of X. In this regard, it appears to be similar to many sparse PCA methods. However, CUR takes a randomized algorithmic
    approach, whereas most sparse PCA methods are framed as convex optimization
    problems. In this paper, we try to understand CUR from a sparse optimization
    viewpoint…

    The CUR decomposition provides an approximation of a matrix X that has low
    reconstruction error and that is sparse in the sense that the resulting approximation
    lies in the span of only a few columns of X. In this regard, it appears to be similar to many sparse PCA methods. However, CUR takes a randomized algorithmic
    approach, whereas most sparse PCA methods are framed as convex optimization
    problems. In this paper, we try to understand CUR from a sparse optimization
    viewpoint. We show that CUR is implicitly optimizing a sparse regression objective and, furthermore, cannot be directly cast as a sparse PCA method. We also
    observe that the sparsity attained by CUR possesses an interesting structure, which
    leads us to formulate a sparse PCA method that achieves a CUR-like sparsity.

    Other authors
    • Michael W. Mahoney
    • Jacob Bien
    See publication

Patents

  • LINKEDIN: POST EXPERIMENT POWER

    Issued US 62/140,305

    Other inventors
  • MOST IMPACTFUL EXPERIMENTS

    Filed US

    Other inventors

Languages

  • Chinese

    -

Recommendations received

View Ya’s full profile

  • See who you know in common
  • Get introduced
  • Contact Ya directly
Join to view full profile

People also viewed

Explore collaborative articles

We’re unlocking community knowledge in a new way. Experts add insights directly into each article, started with the help of AI.

Explore More

Others named Ya Xu in United States

Add new skills with these courses