Bayesian Modeling with PyMC3
Bayesian Modeling with PyMC3 to Predict Dividends: A Classic Small-Data Problem
Outline/Structure of the Talk
- Business Context
- What are Dividends
- Why Predict Dividends
- Why Bayesian
- Bayesian Paradigm
- Bayesian Regression
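As a rough illustration of the "Bayesian Regression" item in a small-data setting, here is a minimal sketch. It is not the speaker's PyMC3 model: it assumes a single predictor, a known noise standard deviation, and independent zero-mean Normal priors, so the posterior is available in closed form without a sampler. The function name, the prior settings, and the five data points are all hypothetical.

```python
import math


def bayes_linear_regression(xs, ys, noise_sd=1.0, prior_sd=10.0):
    """Posterior over (intercept, slope) for y = a + b*x + Normal(0, noise_sd).

    Assumes a known noise_sd and independent Normal(0, prior_sd) priors,
    which makes the posterior Gaussian and available in closed form.
    """
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    prior_prec = 1.0 / prior_sd ** 2
    noise_prec = 1.0 / noise_sd ** 2

    # Posterior precision matrix = prior precision + X^T X / noise_sd^2.
    a11 = prior_prec + noise_prec * n
    a12 = noise_prec * sx
    a22 = prior_prec + noise_prec * sxx
    det = a11 * a22 - a12 * a12

    # Posterior covariance is the inverse of the 2x2 precision matrix.
    c11, c12, c22 = a22 / det, -a12 / det, a11 / det

    # Posterior mean = covariance @ (X^T y / noise_sd^2); the prior mean is zero.
    b1, b2 = noise_prec * sy, noise_prec * sxy
    mean = (c11 * b1 + c12 * b2, c12 * b1 + c22 * b2)
    sd = (math.sqrt(c11), math.sqrt(c22))
    return mean, sd


# Hypothetical toy data: only five observations, i.e. a small-data setting.
past_payout = [1.0, 2.0, 3.0, 4.0, 5.0]
next_payout = [1.2, 1.9, 3.2, 3.8, 5.1]
(intercept, slope), (intercept_sd, slope_sd) = bayes_linear_regression(
    past_payout, next_payout, noise_sd=0.5
)
print(f"slope = {slope:.2f} +/- {slope_sd:.2f}")
```

With so few points the posterior standard deviation stays visibly wide, and a tighter prior (a smaller `prior_sd`) pulls the slope toward zero. That trade-off between prior and data is what makes the Bayesian approach attractive when data is scarce.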
Participants will learn how Bayesian modeling was applied to a small-data problem.
All data enthusiasts
Prerequisites for Attendees
Basic Knowledge of Regression
Submitted 1 week ago
People who liked this proposal, also liked:
Subhasish Misra - Causal data science: Answering the crucial ‘why’ in your analysis.
Subhasish Misra, Staff Data Scientist, Walmart Labs
4 weeks ago · Sold Out!
Causal questions are ubiquitous in data science. For example, questions such as whether changing a feature on a website led to more traffic, or whether digital ad exposure led to incremental purchases, are deeply rooted in causality.
Randomized tests are considered the gold standard for estimating causal effects. However, experiments are in many cases infeasible or unethical. In such cases one has to rely on observational (non-experimental) data to derive causal insights. The crucial difference between randomized experiments and observational data is that in the former, test subjects (e.g. customers) are randomly assigned a treatment (e.g. digital advertisement exposure). This helps curb the possibility that user response (e.g. clicking on a link in the ad and purchasing the product) differs between the treated and non-treated groups owing to pre-existing differences in user characteristics (e.g. demographics, geo-location etc.). In essence, we can then attribute divergences observed post-treatment in key outcomes (e.g. purchase rate) to the causal impact of the treatment.
This treatment assignment mechanism, which makes causal attribution possible via randomization, is absent when using observational data. Thankfully, there are scientific (statistical and beyond) techniques that help circumvent this shortcoming and arrive at causal reads.
The aim of this talk is to offer a practical overview of the above aspects of causal inference, a discipline that lies at the fascinating confluence of statistics, philosophy, computer science, psychology, economics, and medicine, among others. Topics include:
- The fundamental tenets of causality and measuring causal effects.
- Challenges involved in measuring causal effects in real world situations.
- Distinguishing between randomized and observational approaches to measuring the same.
- An introduction to measuring causal effects from observational data using matching and its extension, propensity-score-based matching, with a focus on a) the intuition and statistics behind it, b) tips from the trenches, based on the speaker's experience with these techniques, and c) practical limitations of such approaches.
- A walk-through of an example of how matching was applied to get to causal insights regarding the effectiveness of a digital product for a major retailer.
- Finally, why a nuanced understanding of causality is all the more important in the big data era we are in.
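To make the matching idea in the topics above concrete, here is a minimal sketch of nearest-neighbour matching on the propensity score. It is not the speaker's implementation. It assumes the propensity scores have already been estimated (for example with a logistic regression on user characteristics), and the function name, the caliper value, and the six toy records are hypothetical.

```python
def match_and_estimate_att(units, caliper=0.05):
    """Nearest-neighbour matching on the propensity score, with replacement.

    `units` is a list of dicts with keys:
      treated (bool), score (estimated propensity, 0..1), outcome (float).
    Returns the estimated average treatment effect on the treated (ATT)
    and the number of treated units matched within the caliper.
    """
    treated = [u for u in units if u["treated"]]
    controls = [u for u in units if not u["treated"]]
    diffs = []
    for t in treated:
        # Closest control unit by propensity score.
        nearest = min(controls, key=lambda c: abs(c["score"] - t["score"]))
        # Drop treated units that have no sufficiently similar control.
        if abs(nearest["score"] - t["score"]) <= caliper:
            diffs.append(t["outcome"] - nearest["outcome"])
    if not diffs:
        raise ValueError("no treated unit has a control within the caliper")
    return sum(diffs) / len(diffs), len(diffs)


# Hypothetical records: outcome 1.0 means the user purchased.
users = [
    {"treated": True, "score": 0.80, "outcome": 1.0},
    {"treated": True, "score": 0.60, "outcome": 1.0},
    {"treated": True, "score": 0.95, "outcome": 1.0},
    {"treated": False, "score": 0.78, "outcome": 1.0},
    {"treated": False, "score": 0.62, "outcome": 0.0},
    {"treated": False, "score": 0.20, "outcome": 0.0},
]
att, matched = match_and_estimate_att(users)
print(f"ATT estimate: {att:.2f} from {matched} matched treated users")
```

The treated user with a score of 0.95 has no comparable control and is dropped. This illustrates one practical limitation of matching: the estimate only covers the region where treated and control scores overlap.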
Juan Manuel Contreras - Beyond Individual Contribution: How to Lead Data Science Teams
Juan Manuel Contreras, Head of Data Science, Even
1 week ago · Sold Out!
Despite the increasing number of data scientists who are being asked to take on managerial and leadership roles as they grow in their careers, there are still few resources on how to manage data scientists and lead data science teams. There is also scant practical advice on how to serve as head of a data science practice: how to set a vision and craft a strategy for an organization to use data science.
In this talk, I will describe my experience as a data science leader both at a political party (the Democratic Party of the United States of America) and at a fintech startup (Even.com), share lessons learned from these experiences and conversations with other data science leaders, and offer a framework for how new data science leaders can better transition to both managing data scientists and heading a data science practice.
debapriya das - AI in Martech - Solving the riddle of 4R's
debapriya das, Lead Machine Learning @ Smartech, Netcore Solutions
1 month ago · Sold Out!
In this digital era, when the attention span of customers is shrinking drastically, it is imperative for marketers to understand the following four aspects, popularly known as "The 4R's of Marketing", if they want to increase their ROI:
- Right Person
- Right Time
- Right Content
- Right Channel
Only when we design and send our campaigns in such a way that they reach the right customers at the right time through the right channel, telling them about things they like or are interested in, can we expect higher conversions with lower investment. This is a problem that most organizations need to solve to stay relevant in this age of high market competition.
Among all these, we will put special focus on generating appropriate content for a targeted user base using Markov-based models, and do a quick hack session.
The time breakup can be:
5 mins: Difference between Martech and traditional marketing. The 4R's of marketing and why solving for them is crucial
5 mins: What is Smart Segments and how to solve for it, with a short demo
5 mins: How marketers use output from Smart Segments to execute targeted campaigns
5 mins: What is STO, how it can be solved and what is the performance uplift seen by clients when they use it
5 mins: What is Channel Optimization, how it can be solved and what is the performance uplift seen by clients when they use it
5 mins: Why sending the right message to customers is crucial, and introduction to appropriate content creation
15 mins: Covering different text generation nuances, and a live demo with a walk-through of a toy code implementation
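The Markov-based content generation mentioned above could look roughly like the sketch below. This is not the speaker's demo code. It assumes a first-order, word-level Markov chain trained on a handful of campaign lines. The corpus, the function names, and the start and end tokens are all hypothetical.

```python
import random
from collections import defaultdict


def build_chain(sentences):
    """Map each word to the list of words observed right after it."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate(chain, rng, max_words=12):
    """Random walk from the start token until the end token or max_words."""
    word, out = "<s>", []
    while len(out) < max_words:
        # Sampling from the raw follower list weights words by frequency.
        word = rng.choice(chain[word])
        if word == "</s>":
            break
        out.append(word)
    return " ".join(out)


# Hypothetical campaign lines used as the training corpus.
corpus = [
    "flat 20 percent off on running shoes today",
    "flat 30 percent off on winter jackets this weekend",
    "new running shoes just arrived for you",
    "winter jackets just arrived shop today",
]
chain = build_chain(corpus)
rng = random.Random(7)  # fixed seed so repeated runs give the same text
print(generate(chain, rng))
```

Every adjacent word pair in the output was seen in the corpus, so the text stays locally plausible while recombining phrases across lines. A real system would need a much larger corpus and a way to condition on the target segment.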
Pushker Ravindra - Data Science Best Practices for R and Python
Pushker Ravindra, Data Analytics Lead, Monsanto/Bayer
1 week ago · Sold Out!
How many times have you felt that you could not understand someone else’s code, or sometimes not even your own? This is mostly because of poor or missing documentation and not following best practices. Here I will demonstrate some of the best practices in Data Science for R and Python, the two most important programming languages in the world for Data Science, which help in building sustainable data products.
- Integrated Development Environment (RStudio, PyCharm)
- Coding best practices (Google’s R Style Guide and Hadley’s Style Guide, PEP 8)
- Linter (lintr, Pylint)
- Documentation – Code (Roxygen2, reStructuredText), README/Instruction Manual (RMarkdown, Jupyter Notebook)
- Unit testing (testthat, unittest)
- Version control (Git)
These best practices significantly reduce technical debt in the long term, foster more collaboration, and promote the building of more sustainable data products in any organization.
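On the Python side, several of the practices listed above can be combined in a few lines. The sketch below is only an illustration, not material from the talk. It shows PEP 8 naming, a reStructuredText-style docstring, and a `unittest` test case. The function `conversion_rate` and its behaviour are hypothetical.

```python
import unittest


def conversion_rate(conversions, visitors):
    """Return the share of visitors who converted.

    :param conversions: number of visitors who converted
    :type conversions: int
    :param visitors: total number of visitors; must be positive
    :type visitors: int
    :returns: conversion rate between 0 and 1
    :rtype: float
    :raises ValueError: if ``visitors`` is not positive
    """
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return conversions / visitors


class ConversionRateTest(unittest.TestCase):
    """Unit tests documenting the expected behaviour."""

    def test_typical_value(self):
        self.assertAlmostEqual(conversion_rate(25, 100), 0.25)

    def test_rejects_zero_visitors(self):
        with self.assertRaises(ValueError):
            conversion_rate(1, 0)
```

Saved as a module, the tests can be run with `python -m unittest`, and the same file can be checked with Pylint before it is committed to Git.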
Siboli mukherjee - Real time Anomaly Detection in Network KPI using Time Series
Siboli mukherjee, Telecom professional, Vodafone Idea Ltd
1 week ago · Sold Out!
Accurately detecting Key Performance Indicator (KPI) anomalies is a critical issue in cellular network management. In this talk I shall introduce CNR (Cellular Network Regression), a unified performance anomaly detection framework for KPI time-series data. CNR combines simple statistical modelling with machine-learning-based regression for anomaly detection; in particular, it takes seasonality and trend components into account and supports automated retraining of the prediction model based on prior detection results. I demonstrate how CNR detects two types of anomalies of practical interest, namely sudden drops and correlation changes, based on a large-scale real-world KPI dataset collected from a metropolitan LTE network. I explore various prediction algorithms and feature selection strategies, and provide insights into how regression analysis can make automated and accurate KPI anomaly detection viable.
Index Terms—anomaly detection, NPAR (Network Performance Analysis)
The continuing advances of cellular network technologies make high-speed mobile Internet access a norm. However, cellular networks are large and complex by nature, and hence production cellular networks often suffer from performance degradations or failures due to various reasons, such as background interference, power outages, malfunctions of network elements, and cable disconnections. It is thus critical for network administrators to detect and respond to performance anomalies of cellular networks in real time, so as to maintain network dependability and improve subscriber service quality. To pinpoint performance issues in cellular networks, a common practice adopted by network administrators is to monitor a diverse set of Key Performance Indicators (KPIs), which provide time-series data measurements that quantify specific performance aspects of network elements and resource usage. The main task of network administrators is to identify any KPI anomalies, which refer to unexpected patterns that occur at a single time instant or over a prolonged time period.
Today’s network diagnosis still mostly relies on domain experts to manually configure anomaly detection rules; such a practice is error-prone, labour-intensive, and inflexible. Recent studies propose to use (supervised) machine learning for anomaly detection in cellular networks.
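The idea from the abstract of flagging sudden drops against a seasonality-aware prediction can be sketched in a few lines. This is not the CNR framework: it assumes the simplest possible predictor (the training mean of each seasonal slot) and does not cover correlation changes, trend, or retraining. The function name, the threshold, and the KPI values are hypothetical.

```python
import statistics


def detect_sudden_drops(values, period, history, threshold=3.0):
    """Flag indices where a KPI falls far below its seasonal prediction.

    The first `history` points are treated as anomaly-free training data
    (`history` must be at least `period`). The prediction for a time step is
    the training mean of its seasonal slot (index modulo `period`), and the
    threshold is expressed in standard deviations of the training residuals.
    """
    train = values[:history]
    slots = [[] for _ in range(period)]
    for i, v in enumerate(train):
        slots[i % period].append(v)
    seasonal_mean = [statistics.mean(s) for s in slots]

    # Spread of the training residuals around the seasonal prediction.
    residuals = [v - seasonal_mean[i % period] for i, v in enumerate(train)]
    spread = statistics.pstdev(residuals) or 1e-9  # guard against zero spread

    anomalies = []
    for i in range(history, len(values)):
        z = (values[i] - seasonal_mean[i % period]) / spread
        if z < -threshold:  # only large negative deviations count as drops
            anomalies.append(i)
    return anomalies


# Hypothetical KPI with a period of 4; the value at index 14 is an injected drop.
kpi = [101, 119, 140, 121, 99, 121, 141, 119, 100, 120, 139, 120,
       100, 121, 90, 120]
print(detect_sudden_drops(kpi, period=4, history=12))
```

A value of 90 would look unremarkable against a global average, but it is far below what is expected for its seasonal slot. That is why the seasonal component matters for this kind of detection.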