TCE Conference 2013 – Speakers
Leon Bottou (Microsoft Research)
Yishay Mansour (Tel Aviv U and Microsoft Research)
Andrei Broder (Google)
Shaul Markovitch (Technion)
Freddy Bruckstein (Technion)
Andrew Ng (Stanford)
Michael Elad (Technion)
Fernando Pereira (Google)
Yaakov Engel and Amir Navot (Rafael)
Dana Ron (Tel Aviv U)
Oren Etzioni (UW)
Yoram Singer (Google)
Shai Fine (Intel)
Aya Soffer (IBM)
Yoav Freund (UCSD)
Gadi Solotorevsky (Cvidya)
Nir Friedman (Hebrew U)
Naftali Tishby (Hebrew U)
Tom Heiser (RSA)
Jeffrey Ullman (Stanford)
Gadi Kimmel (Final)
Elad Yom-Tov (Microsoft Research)
Ronny Lempel (Yahoo! Labs)
Léon Bottou received the Diplôme d'Ingénieur de l'École Polytechnique (X84) in 1987, the Magistère de Mathématiques Fondamentales et Appliquées et d'Informatique from École Normale Supérieure in 1988, and a Ph.D. in Computer Science from Université de Paris-Sud in 1991. Léon joined AT&T Bell Laboratories in 1991 and went on to AT&T Labs Research and NEC Labs America. He joined the Science team of Microsoft adCenter in 2010 and Microsoft Research in 2012. Léon's primary research interest is machine learning. His secondary research interest is data compression and coding. His best known contributions are his work on large scale learning and on the DjVu document compression technology.
Statistical machine learning technologies in the real world are never without a purpose. Using their predictions, humans or machines make decisions whose circuitous consequences often violate the modeling assumptions that justified the system design in the first place. Such contradictions appear very clearly in computational advertising systems. The design of the ad placement engine directly influences the occurrence of clicks and the corresponding advertiser payments. It also has important indirect effects: (a) ad placement decisions impact the satisfaction of the users and therefore their willingness to frequent this web site in the future, (b) ad placement decisions impact the return on investment observed by the advertisers and therefore their future bids, and (c) ad placement decisions change the nature of the data collected for training the statistical models in the future.
Popular theoretical approaches, such as auction theory or multi-armed bandits, only address selected aspects of such a system. In contrast, the language and the methods of causal inference provide a full set of tools to answer the vast collection of questions facing the designer of such a system. Is it useful to pass new input signals to the statistical models? Is it worthwhile to collect and label a new training set? What about changing the loss function or the learning algorithm? In order to answer such questions, one needs to unravel how the information produced by the statistical models traverses the web of causes and consequences and produces measurable losses and rewards.
This talk provides a real world example demonstrating the value of causal inference for large-scale machine learning. It also illustrates a collection of practical counterfactual analysis techniques applicable to many real-life machine learning systems, including causal inference techniques applicable to continuously valued variables with meaningful confidence intervals, and quasi-static analysis techniques for estimating how small interventions affect certain causal equilibria. In the context of computational advertising, this analysis elucidates the connection between auction theory and machine learning.
Andrei Broder is a Google Distinguished Scientist. From 2005 to 2012 he was a Fellow and VP for Computational Advertising at Yahoo!. Prior to this, he worked at IBM as a Distinguished Engineer and the CTO of the Institute for Search and Text Analysis and at AltaVista as VP for Research and Chief Scientist. He graduated summa cum laude from the Technion and obtained his M.Sc. and Ph.D. at Stanford under Don Knuth. Broder has authored more than a hundred papers and was awarded thirty-nine US patents. His current research interests are centered on personalization, computational advertising, web search, context-driven information supply, and randomized algorithms. He is a member of the US National Academy of Engineering, a Fellow of ACM and of IEEE, and a co-winner of the 2012 ACM Paris Kanellakis Theory and Practice Award for "groundbreaking work on Locality-Sensitive Hashing".
Ten years ago we proposed an efficient query evaluation process consisting of two stages: first, a fast, approximate evaluation on candidate documents; then, a full, but slow evaluation of promising candidates. The first stage was based on a then-new Boolean predicate called WAND, standing for “Weak AND” or “Weighted AND”. WAND takes as arguments a list of Boolean variables X1, X2,…,Xk, a list of associated positive weights, w1, w2,…,wk, and a threshold θ, and returns TRUE iff the weighted sum of the variables exceeds the threshold. Using WAND as a primitive in a large-scale search engine makes the computation of the top-k scoring documents according to the formula above very efficient and allows for a wide range of efficiency vs. accuracy tradeoffs.
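The predicate as defined in this paragraph can be written down directly. The following is a minimal sketch, not code from the WAND paper; the function name `wand` and its argument names are hypothetical, and "exceeds" is read here as a strict inequality, which is an assumption.

```python
from typing import Sequence


def wand(xs: Sequence[bool], weights: Sequence[float], theta: float) -> bool:
    """Return True iff the weighted sum of the true variables exceeds theta.

    Assumption: "exceeds" is taken as strictly greater than theta.
    """
    if len(xs) != len(weights):
        raise ValueError("xs and weights must have the same length")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")
    # Only variables that are True contribute their weight to the sum.
    return sum(w for x, w in zip(xs, weights) if x) > theta


# With unit weights, theta = k - 1 behaves like AND and theta = 0 like OR.
print(wand([True, True, True], [1, 1, 1], 2))     # all three needed
print(wand([False, False, True], [1, 1, 1], 0))   # any one suffices
```

Under this reading, moving θ between the two extremes trades a strict conjunction for a loose disjunction, which is one way to picture the efficiency vs. accuracy tradeoff mentioned above.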
Given the popularity of search engines using inverted indices over billions of documents and the complex machine learned models used for ranking, WAND and its generalizations have proved useful for many “big data” purposes: complex queries, uniform sampling of matching documents, classification, etc. In this talk, we will start by presenting the WAND operator and explaining the reasons for its efficiency. We will then briefly survey some of the applications, extensions, and improvements to WAND-based retrieval that emerged in the last decade. Finally we will discuss in more detail a recent result whereby we reduce the cost of the classic k-means algorithm to allow it to scale to a very high number of clusters by adopting WAND-style ranked retrieval techniques. The key insight is to invert the usual k-means process of point-to-centroid assignment by building an index over the given points and using the current centroids as queries to decide on the cluster membership.
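The inverted assignment step can be pictured with a small sketch. This is an illustration under stated assumptions, not the algorithm from the clustering paper: the function `assign_by_retrieval` and the parameter `top` are hypothetical, a brute-force sort stands in for the WAND-style index, and the rule that a point moves only when a retrieving centroid is closer than its previous one is a choice made here.

```python
from typing import List, Sequence

Point = Sequence[float]


def dist2(a: Point, b: Point) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_by_retrieval(points: List[Point], centroids: List[Point],
                        top: int, previous: List[int]) -> List[int]:
    """Each centroid acts as a query that retrieves its `top` closest points.

    A point moves to a retrieving centroid only if that centroid is closer
    than the one it was previously assigned to; points that no centroid
    retrieves keep their previous assignment.
    """
    assign = list(previous)
    best = [dist2(p, centroids[c]) for p, c in zip(points, previous)]
    for c_idx, c in enumerate(centroids):
        # Brute-force ranking; a real system would query an index instead.
        ranked = sorted(range(len(points)),
                        key=lambda i: dist2(points[i], c))[:top]
        for i in ranked:
            d = dist2(points[i], c)
            if d < best[i]:
                best[i], assign[i] = d, c_idx
    return assign


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids = [(0.0, 0.5), (10.0, 10.5)]
print(assign_by_retrieval(points, centroids, top=2, previous=[1, 1, 0, 0]))
```

A small `top` makes each round cheaper but can leave some points on a stale centroid, which is the kind of approximation a ranked-retrieval approach accepts.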
The original WAND paper is joint work with David Carmel, Michael Herscovici, Aya Soffer, and Jason Zien. The clustering paper is joint work with Lluis Garcia-Pueyo, Vanja Josifovski, Sergei Vassilvitskii, and Srihari Venkatesan.
Alfred M. Bruckstein is Ollendorff Professor of Science in the Computer Science Department of the Technion. He holds BSc and MSc degrees in EE from the Technion (1976 and 1980), and his PhD is from Stanford University (1984). His research is in Image and Signal Processing, Ant Robotics and Applied Geometry. He was the Dean of the Technion Graduate School, and the Head of Technion's Undergraduate Excellence Program. Six of his PhD students are currently Professors at top Universities in Israel and the USA.
Michael Elad received his B.Sc. (1986), M.Sc. (1988) and D.Sc. (1997) from the Department of Electrical Engineering at the Technion, Israel. Since 2003 Michael has been a faculty member at the Computer Science Department at the Technion, and since 2010 he has held a full professorship.
Michael Elad works in the field of signal and image processing, specializing in inverse problems, sparse representations and super-resolution. Michael has received the Technion's best lecturer award seven times and is the recipient of the 2007 Solomon Simon Mani award for excellence in teaching, the 2008 Henri Taub Prize for academic excellence, and the 2010 and 2013 Hershel-Rich prizes for innovation. Michael has been an IEEE Fellow since 2012. He serves as an editor for SIAM SIIMS, IEEE-TIT, ACHA, and IEEE SPL.
Images, video, audio, text documents, financial data, medical information, traffic info – all these and many others are data sources that can be effectively processed. Why? Is it obvious? In this talk we will start by discussing "modeling" of data as a way to enable their actual processing, putting emphasis on sparsity-based models. We will turn our attention to graph-structured data and propose a tailored sparsifying transform for its dimensionality reduction and subsequent processing. We shall conclude by showing how this new transform becomes relevant and powerful in revisiting … classical image processing tasks.
Yaakov Engel received his B.Sc. (magna cum laude) in Physics in 1994 from Tel-Aviv University. During 1995-2005 he worked for several startup companies, and in 2005 he received his Ph.D. from the Hebrew University’s Interdisciplinary Center for Neural Computation (ICNC). During 2005-2007 he was an Alberta Ingenuity fellow at the University of Alberta’s center for machine learning (AICML). Since 2008 he has been employed by Rafael Advanced Defense Systems. His interests include machine learning and statistical estimation, with a focus on industrial applications thereof. He grows his own vegetables.
Amir Navot received his B.Sc. (summa cum laude) in Mathematics in 1997 from Tel-Aviv University. In 2007 he received his Ph.D. from the Hebrew University’s Interdisciplinary Center for Neural Computation (ICNC). During 1997-1998 he worked for Schema Ltd. and during 1998-2003 for Banter Ltd. In 2005 he also served as a lecturer at Tel-Hai Academic College. Since 2006 he has been employed by Rafael Advanced Defense Systems. His interests include machine learning and particularly feature selection, with a focus on integrating machine learning into complicated mission-critical systems.
Classifiers used in mission-critical applications, where misclassification errors incur high costs, should be robust to training-set artifacts, such as insufficient or misrepresentative coverage, and be tolerant to high levels of missing values. As such, they are required to support intensive designer-control, and a range of validation procedures that must go beyond cross-validation. For such applications, we advocate the use of a family of classifiers that employ a factored model of the posterior class probabilities. These classifiers are simple, interpretable, allow their designers to enforce a variety of domain-specific constraints, and can handle missing data both during training and at prediction time. Such classifiers are also capable of explaining their decisions in terms of the basic measured quantities. Due to the factored nature of the classifier, the learning algorithms are highly efficient and parallelizable, allowing us to tackle big data.
Oren Etzioni grew up in Israel, graduated from Harvard in 1986, and received his PhD from Carnegie Mellon in 1991. He is the WRF Entrepreneurship Professor of Computer Science at the University of Washington. He is the author of over 200 technical papers, cited over 18,000 times, and was selected as a AAAI Fellow for his contributions to software agents, Web-based technologies, and intelligent user interfaces. He is the founder of three companies focused on increased transparency for shoppers. His first company, Netbot, was the first online comparison shopping company (acquired by Excite in 1997). His second company, Farecast, advised travelers when to buy their air tickets. Farecast was acquired by Microsoft in 2008 and became the foundation for Bing Travel. Decide.com, founded in 2010, utilizes cutting-edge data- and text-mining methods to minimize buyer’s remorse.
Abstract: The Future of Semantic Web Search
In the mobile world, “10 blue links” are increasingly unsatisfying. Google’s Knowledge Graph and Facebook’s Graph Search demonstrate how Web search is evolving from document search to Question Answering. My talk will outline the next phase in this evolution with a focus on “Open Information Extraction”—the scaling of information extraction to the Web—and its implications for Web search.
Shai Fine is a Principal Engineer at the Advanced Analytics group in Intel, focusing on Machine Learning, Business Intelligence, and Big Data. Prior to Intel, Shai worked for the IBM Research Lab in Haifa, managing the Analytics research department. Shai received his Ph.D. in computer science in 1999 from the Hebrew University in Israel, and conducted his postdoctoral research at the Human Language Technologies department in IBM’s T.J. Watson Research Center in New York. Shai has published over 30 papers in refereed journals and conference proceedings, and co-invented 10 patents in various domains of Advanced Analytics.
The exponential growth of data makes the famous (paraphrased) quote by Naisbitt and Rogers more relevant than ever. The Big Data era has led to the emergence of many tools and techniques designed to cope with the enormous amount of data and translate it into accessible and valuable information. But moving beyond retrievable information to the construction of new content, predictive outcomes, and prescription is still a major challenge. In this talk I will discuss the role of advanced analytics in transforming the accumulated information into actionable knowledge that provides real business value, review some leading themes, best practices, and a few success stories, and briefly discuss some of the key challenges ahead of us.
Yoav Freund is a professor of Computer Science and Engineering at UC San Diego. His work is in the area of machine learning, computational statistics and their applications in biology, image processing and signal processing. Dr. Freund is an internationally known researcher in the field of machine learning — a field which bridges computer science and statistics. He is best known for his joint work with Dr. Robert Schapire on the Adaboost algorithm. For this work they were awarded the 2003 Gödel prize in Theoretical Computer Science and the Kanellakis Prize in 2004. In 2008 Dr. Freund was elected as a AAAI Fellow.
Nir Friedman earned his B.Sc. in Mathematics and Computer Science from Tel-Aviv University (1987) and his M.Sc. from the Weizmann Institute of Science (1992). In 1997, he completed his Ph.D. at Stanford under the supervision of Joseph Halpern. After a postdoctoral fellowship at the University of California, Berkeley, he accepted a faculty position at the School of Computer Science, the Hebrew University of Jerusalem.
His highly-cited research includes work on Bayesian network classifiers, Bayesian Structural EM, and the use of Bayesian methods to analyze gene expression data. More recent works focus on Probabilistic Graphical Models, reconstructing Regulatory Networks, Genetic Interactions, and the role of Chromatin in Transcriptional Regulation.
In 2009, Friedman and Daphne Koller published a textbook on Probabilistic Graphical Models. Later that year, he joined the Institute of Life Sciences, and opened an experimental lab where he uses advanced robotic tools to study transcriptional regulation in the yeast. His current interest is in understanding the interactions between chromatin state and transcription regulatory pathways, with the aim of understanding the principles for transcriptional regulation and cellular memory.
A central question in molecular biology is understanding how cells process information and decide how to react. A crucial level of regulation is on gene transcription, the first step in producing the protein that the gene encodes. Transcriptional regulation is crucial for defining the cell’s identity and its ability to function. The main dogma is that regulatory “instructions” are part of the genetic blueprint encoded in the genome, the sequence of DNA base-pairs. In recent years there has been growing evidence for additional layers of information that are passed from a cell to its daughter cells not through the DNA sequence. One of these layers is chromatin, the protein-DNA complex that forms chromosomes. The basic unit of chromatin is a nucleosome, around which about 150 base pairs of DNA are wound. Each nucleosome can be modified by addition of multiple discrete marks, which in turn can be recognized by specific regulatory proteins that modify nucleosomes or impact transcription. As such, nucleosomes serve as a substrate for recording information by regulatory elements and reading it by others, and for passing information to daughter cells following cell division.
These new discoveries raise basic questions of what chromatin state encodes, how it is maintained, updated, and passed to the next generations, and how it interacts with transcription. The research to answer these questions relies on new methodologies that collect massive amounts of data about chromatin state at each location along the genome. In this talk I will provide an overview of the field and describe ongoing investigations that attempt to answer these questions.
President, RSA, The Security Division of EMC
Tom Heiser is President of RSA, The Security Division of EMC, a position he has held since February 2011. He reports to Art Coviello, Executive Vice President, EMC Corporation and Executive Chairman, RSA, and to David Goulden, EMC President and Chief Operating Officer. With 3,000 employees worldwide, more than 30,000 customers, and over 25 years in the industry, RSA is the acknowledged leader in information security and a vital component of EMC’s strategy to help customers store, manage, protect, and analyze their most valuable asset – information – in a more agile, trusted, and cost-efficient way.
In his role as President, Heiser oversees all aspects of RSA’s business operations, including worldwide sales and services, channel strategy, product development, marketing, strategic business and financial initiatives, IT, and technical support. He joined RSA in July 2008 as Senior Vice President of Global Customer Operations and was promoted to COO in April 2010.
Heiser helped form EMC’s Cloud Infrastructure and Services Division and was Senior Vice President and General Manager of the company’s Centera Business Unit, launching a new storage category and making Centera one of EMC’s fastest-growing product lines. He also served as EMC’s Senior Vice President, Corporate Development and New Ventures, leading the team responsible for mergers and acquisitions, including the acquisition of RSA in 2006. Earlier in his career he served as the product manager for EMC’s first storage system, Orion, predecessor of the company’s industry-leading Symmetrix systems.
Heiser joined EMC in 1984, upon college graduation, in the company’s first sales-training class. Later that year, he opened EMC’s first sales office in New York. In 1996, despite EMC’s predominately direct sales model, Heiser established the company’s first significant channel sales effort. Over the five years he ran the channel organization, sales grew from $8 million to more than $2 billion globally and the EMC partner network expanded to include resellers, VARs, ISVs, OEMs, and system integrators worldwide.
As EMC’s fourteenth employee, Heiser is passionate about perpetuating the company’s history and culture with employees and is an invaluable inspiration. He earned his bachelor’s degree in accounting information systems from the University of Massachusetts.
Mobility, cloud computing and the growth of consumer-driven IT have transformed how digital services are created and delivered. But while they’ve opened the enterprise to new possibilities for collaboration, communication and productivity, they have also opened it up to new risks: adversaries of all types are finding new ways to exploit the hyper-connectivity of our digital society for their own illicit purposes. Tom Heiser will outline the trends driving the transformation of digital technology and the threat landscape in which those technologies must operate, and explain how new data-driven approaches to security promise to put the balance of power back into the hands of security professionals. He will describe how Intelligence-driven security, powered by big data and analytics, offers new ways to provide security for today’s open, borderless enterprises, without sacrificing convenience or agility.
Gad Kimmel earned his MS (1998) from the Technion, and his PhD in Computer Science from Tel Aviv University (2006). His PhD work was on computational methods in modern human genetics. After completing his PhD, he was a postdoctoral research fellow at UC Berkeley for two years, hosted by Richard Karp and Michael Jordan. His main research interests include statistical learning and applied machine learning. Currently, he is a senior researcher at Final, a company that specializes in the development of trading algorithms in financial markets.
Algorithmic Trading (AT) and High Frequency Trading (HFT) have evolved rapidly over the last decade, and nowadays account for 30% to 60% of all trades in the various financial markets. This market, with an estimated yearly size of about $10 billion, is divided among a few dozen major companies. As technology progresses, faster computers with faster communication abilities are being used both in trading and in the infrastructure of the exchanges. This increases the amount of data that must be handled, processed and reacted to within microseconds. As a result, algorithmic trading companies invest heavily in technology and in the algorithmic ability to process larger numbers of signals, coordinated around the world. These firms combine cutting-edge statistical and machine learning theory on the algorithmic side with very strong software and hardware engineering.
In this talk, I will introduce the main financial instruments used in financial markets and their roles, and go over the history of the algorithmic trading industry. Next, I will present some of the technological and mathematical challenges we have in the big data world of current financial markets.
Ronny Lempel joined Yahoo! Research in October 2007 to open and establish its lab in Haifa, Israel. Since joining Yahoo!, Ronny has led R&D activities in diverse areas, including Web Search, Page Optimization and Recommender Systems. In January 2013 Ronny was appointed Yahoo! Labs’ Chief Data Scientist. Prior to joining Yahoo!, Ronny spent 4.5 years at IBM's Haifa Research Lab with the Information Retrieval Group, where his duties included research and development in the area of enterprise search systems. Ronny received his PhD, which focused on search engine technology, from the Faculty of Computer Science at Technion, Israel Institute of Technology in early 2003. During his PhD studies, Ronny spent two summer internships at the AltaVista search engine.
Recommender Systems over large user and item bases constitute a canonical Big Data/Data Science application. The area has received much attention over the past 15 years, initially driven by eCommerce use-cases. This talk calls out several research challenges in the art of recommendation technology as applied in Web media sites. One particular characteristic of such recommendation settings is the relatively low cost of falsely recommending an irrelevant item, which means that recommendation schemes can be less conservative and more exploratory. This also creates opportunities for better item cold-start handling. Other technical difficulties, especially for offline modeling given data describing users’ consumption of items, pertain to the bias introduced by the items the Web site actually offered to users. Also called out are tradeoffs between personalization and contextualization, novel schemes that aim at recommending sets and sequences of items, and the challenge of incrementally updating recommendation schemes given constantly arriving new interactions.
Tel Aviv University and Microsoft Research
Prof. Yishay Mansour received his PhD from MIT in 1990, after which he was a postdoctoral fellow at Harvard and a Research Staff Member at the IBM T. J. Watson Research Center. Since 1992 he has been at Tel-Aviv University, where he is currently a Professor of Computer Science and served as the first head of the School of Computer Science during 2000-2002. Prof. Mansour has a part-time position at Microsoft Research in Israel, and has held visiting positions with Bell Labs, AT&T research Labs, IBM Research, and Google Research. He has mentored start-ups such as Riverhead, which was acquired by Cisco, Ghoonet and Verix. Prof. Mansour has published over 50 journal papers and over 100 proceeding papers in various areas of computer science with special emphasis on communication networks, machine learning, and algorithmic game theory, and has supervised over a dozen graduate students in those areas. He is currently the director of the Israeli Center of Research Excellence in Algorithms.
Prof. Mansour is currently an associate editor of a number of distinguished journals and has been on numerous conference program committees. He was the program chair of COLT (1998) and has served on the COLT steering committee.
Shaul Markovitch received his B.Sc. from the Hebrew University, and his Ph.D. from the University of Michigan. He spent one year at General Motors Research and has since been a faculty member at the Computer Science Department at the Technion. He has served on various committees of Artificial Intelligence and Machine Learning, and served as area chair for AAAI 2011 and AAAI 2012. He is currently serving as an associate editor of the Journal of Artificial Intelligence Research. His areas of research include speedup learning, selective learning, feature generation, active learning, anytime learning, resource-bounded reasoning, multi-agent reasoning and learning, text classification, semantic relatedness, and natural language semantics.
Andrew Ng's research is in the areas of machine learning and artificial intelligence. Through building very large scale cortical (brain) simulations, he is developing algorithms that can learn to sense and perceive without needing to be explicitly programmed. Using these techniques, he has developed sophisticated computer vision algorithms, as well as a variety of highly capable robots, such as by far the most advanced autonomous helicopter controller, which is able to fly spectacular aerobatic maneuvers. His group also developed ROS, which is today by far the most widely used open-source robotics software platform. In 2011, he also taught an online Machine Learning class to over 100,000 students, leading to his co-founding Coursera, which is offering high quality online courses. Ng has also been named to the 2013 "Time 100" list of the most influential people in the world.
Machine learning is a very successful technology, but applying it to a new problem usually means spending a long time hand-designing the input features to feed to the learning algorithm. This is true for applications in vision, audio, and text/NLP. To address this, researchers in machine learning have recently developed "deep learning" algorithms, which can automatically learn feature representations from unlabeled data, thus bypassing most of this time-consuming engineering. These algorithms are based on building massive artificial neural networks, that were loosely inspired by cortical (brain) computations. In this talk, I describe the key ideas behind deep learning, and also discuss the computational challenges of getting these algorithms to work. I'll also present a few case studies, and report on the results from a project that I led at Google to build massive deep learning algorithms, resulting in a highly distributed neural network trained on 16,000 CPU cores, and that learned by itself to discover high level concepts such as common objects in video.
Fernando Pereira is research director at Google. His previous positions include chair of the Computer and Information Science department of the University of Pennsylvania, head of the Machine Learning and Information Retrieval department at AT&T Labs, and research and management positions at SRI International. He received a Ph.D. in Artificial Intelligence from the University of Edinburgh in 1982, and he has over 120 research publications on computational linguistics, machine learning, bioinformatics, speech recognition, and logic programming, as well as several patents. He was elected AAAI Fellow in 1991 for contributions to computational linguistics and logic programming, and ACM Fellow in 2010 for contributions to machine-learning models of natural language and biological sequences. He was president of the Association for Computational Linguistics in 1993.
Advances in statistical and machine learning approaches to natural-language analysis have yielded a wealth of methods and applications in information retrieval, speech recognition, machine translation, and information extraction. Yet, even as we enjoy these advances, we recognize that our successes are to a large extent the result of clever exploitation of redundancy in language structure and use, allowing our algorithms to eke out a few useful bits that we can put to work in applications. By focusing on applications that extract a limited amount of information from the text, finer structures such as word order or syntactic structure could be largely ignored in information retrieval or speech recognition. However, by leaving out those finer details, our language-processing systems have been stuck in an "idiot savant" stage where they can find everything but cannot understand anything. The main language processing challenge of the coming decade is to create robust, accurate, efficient methods that learn to understand the main entities and concepts discussed in any text, and the main claims made. That will enable our systems to answer questions more precisely, to verify and update knowledge bases, and to trace arguments for and against claims throughout the written record. I will argue with examples from our recent research that we need deeper levels of linguistic analysis to do this. But I will also argue that it is possible to do much that is useful even with our very partial understanding of linguistic and computational semantics, by taking (again) advantage of distributional regularities and redundancy in large text collections to learn effective analysis and understanding rules. Thus low-pass semantics: our scientific knowledge is very far from being able to map the full spectrum of meaning, but by combining signals from the whole Web, our systems are learning to read the simplest factual information reliably.
Tel Aviv University
Dana Ron received her Ph.D. from the Hebrew University of Jerusalem in 1995. Between 1995 and 1997 she was an NSF Postdoc at MIT. During the academic year 1997-8 she was a Bunting science scholar at Radcliffe and MIT. Since 1998 she has been a faculty member at the Faculty of Engineering, Tel Aviv University. During the academic year 2003-4 she was a fellow at the Radcliffe Institute for Advanced Study, Harvard University. Her research focuses on Sublinear Approximation algorithms and in particular Property Testing.
When we refer to efficient algorithms, we usually mean polynomial-time algorithms. In particular this is true for graph algorithms, which are considered efficient if they run in time polynomial in the number of vertices and edges of the graph.
However, when considering very large graphs we seek even more efficient algorithms whose running time is sublinear in the size of the input graph.
Such algorithms do not even read the entire graph, but rather sample random parts of the graph and compute approximations of various parameters of interest. In this talk I will survey various such algorithms, where the parameters I will discuss are:
(1) The average degree and the number of small stars
(2) The weight of a minimum spanning tree
(3) The size of a minimum vertex cover and a maximum matching
(4) The number of edges that should be added/removed in order to obtain various properties
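To make the sampling idea concrete for parameter (1), here is a minimal sketch of the naive estimator: query the degree of a few uniformly random vertices and average. This is only an illustration of reading part of the graph, not one of the algorithms surveyed in the talk, and the naive average can be far off when a few vertices have very high degree. The function name and the `degree` oracle interface are assumptions made for this example.

```python
import random
from typing import Callable, Optional


def estimate_average_degree(n: int, degree: Callable[[int], int],
                            num_samples: int,
                            seed: Optional[int] = None) -> float:
    """Estimate the average degree from `num_samples` degree queries.

    Vertices are assumed to be labelled 0..n-1 and `degree(v)` is assumed
    to answer a single degree query without scanning the graph.
    """
    if n <= 0 or num_samples <= 0:
        raise ValueError("n and num_samples must be positive")
    rng = random.Random(seed)
    total = 0
    for _ in range(num_samples):
        v = rng.randrange(n)   # uniform random vertex
        total += degree(v)     # one query, independent of graph size
    return total / num_samples


# A cycle on one million vertices: every vertex has degree 2,
# so 100 queries already give the exact answer.
print(estimate_average_degree(1_000_000, lambda v: 2, num_samples=100, seed=0))
```

The number of queries is fixed by `num_samples`, so the running time does not grow with the number of vertices or edges.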
Yoram Singer is a senior research scientist at Google. From 1999 through 2007 he was an associate professor at the Hebrew University of Jerusalem. From 1995 through 1999 he was a member of the technical staff at AT&T Research. He was the co-chair of the conference on Computational Learning Theory in 2004 and of Neural Information Processing Systems in 2007. He serves as an editor of the Journal of Machine Learning, IEEE Signal Processing Magazine, and IEEE Transactions on Pattern Analysis and Machine Intelligence.
In the talk we review Sibyl, one of the largest machine learning platforms at Google. Sibyl can handle over 100B examples in 100B dimensions so long as each example is very sparse. The recent version of Sibyl fuses Nesterov's accelerated gradient method with parallel boosting. The result is an algorithm that retains the momentum and convergence properties of the accelerated gradient method while taking into account the curvature of the objective function. The algorithm, termed BOOM, converges quickly, supports any smooth convex loss function, and is easy to parallelize. We conclude with a few examples of problems at Google that the system handles.
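For readers unfamiliar with the accelerated gradient method mentioned above, the following Python sketch shows one standard form of Nesterov's method on a toy quadratic. It covers only the momentum ingredient; it is not BOOM and does not include the boosting component or any parallelism. The objective, the constants, and all function names are assumptions chosen for illustration.

```python
import math

# Toy smooth convex objective: f(x) = sum_i (0.5 * a_i * x_i^2 - b_i * x_i).
# Its minimizer is x_i = b_i / a_i and its gradient is Lipschitz with L = max(a_i).
CURVATURES = (1.0, 10.0)
LINEAR_TERMS = (1.0, 1.0)


def quadratic_grad(x):
    """Gradient of the toy objective at x."""
    return [a * xi - b for a, xi, b in zip(CURVATURES, x, LINEAR_TERMS)]


def nesterov_minimize(grad, x0, lipschitz, num_iters):
    """Nesterov's accelerated gradient method with step size 1/L."""
    x = list(x0)   # current iterate
    y = list(x0)   # extrapolated ("look-ahead") point
    t = 1.0        # momentum sequence
    step = 1.0 / lipschitz
    for _ in range(num_iters):
        g = grad(y)
        x_next = [yi - step * gi for yi, gi in zip(y, g)]   # gradient step from y
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        momentum = (t - 1.0) / t_next
        # Extrapolate along the direction of the last move.
        y = [xn + momentum * (xn - xo) for xn, xo in zip(x_next, x)]
        x, t = x_next, t_next
    return x


if __name__ == "__main__":
    print(nesterov_minimize(quadratic_grad, [0.0, 0.0], max(CURVATURES), 2000))
```

The gradient is evaluated at the extrapolated point y rather than at the current iterate x, which is what distinguishes the accelerated method from plain gradient descent.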
Dr. Aya Soffer is Director of Big Data Analytics Research at IBM. In this role, Dr. Soffer directs the world-wide market. Dr. Soffer leads IBM Research's Massive Scale Analytics Big Bet, an effort to develop new technologies to facilitate making smart decisions from information at very large scales. This is a world-wide effort across seven Research labs, focusing on extracting insight from the vast amount of information gathered from heterogeneous sources and making it readily available to decision makers where and when they need it. In the Haifa Research Lab, Dr. Soffer is the Department Group Manager of the Big Data Analytics department, which focuses on turning information into assets, with emphasis on unstructured information including new forms of social media and networks. Many of these innovations are already included in IBM product and services offerings. Dr. Soffer has published over 30 papers in refereed journals and conferences and filed over 15 patents. She has additionally served on program committees and as track chair in many leading conferences.
Enterprises today utilize only a small fraction of the data they have at their disposal. The challenge they face is to extract insight from an immense volume, variety and velocity of data. Companies can, for example, analyze petabytes of tweets to extract insight on the public opinion of their brand or discover negative sentiment toward a new product in almost real time, and adapt their services accordingly. Hospitals can monitor the hundreds of sensors in intensive care units. Analysis of climate and weather data can enable wind turbine and solar panel companies to improve their planning. And police and emergency units can analyze data from security cameras in real time and identify suspicious behaviors. As the amount of data continues to grow, so does the potential of utilizing this data and turning it into valuable information for decision makers. In this lecture we will outline several solutions and experiences working with customers in the Big Data space. We will describe the technologies behind these innovations.
Dr. Gadi Solotorevsky is the CTO of cVidya Networks. He is a specialist in Telecommunications and Revenue Intelligence with years of experience developing and deploying Revenue Intelligence solutions and methodologies. Gadi is one of the founders and the chair of the Revenue Assurance modeling team of the TM Forum, and a TM Forum Distinguished Fellow and Ambassador. Gadi holds a PhD in Computer Science.
Harnessing big data analytics for marketing purposes is one of the hottest trends in today’s telecommunications industry, which is struggling with commoditization and fierce competition from Over-The-Top (OTT) players such as Skype, Netflix, and WhatsApp. cVidya provides telecom revenue analytics software to some of the world’s largest service providers, including British Telecom, Telefonica Group and O2, Vodafone, AT&T, MTN SA, Deutsche Telekom Group, and others. Its CTO, Dr. Gadi Solotorevsky, will present cVidya’s achievements in applying innovative technologies to customer behavior analysis and forecasting. Several use-cases of big data analytics in telecom will be shared to demonstrate the value of analyzing huge amounts of data for optimizing the offerings while protecting the revenue and profit targets. Several real-life examples will show how the operator’s marketing team can better answer key questions such as: Which OTT apps are crippling other services? Which customers are at risk of leaving? What is the propensity that a customer will accept a new marketing offer? What will be the forecasted revenue impact of a new offer in the next 3 months? How do high churn and influencer scores impact the next best action? How can the operator cross-sell and up-sell content and packages targeting the right customers?
Naftali Tishby is a professor of Computer Science and the director of the Interdisciplinary Center for Neural Computation (ICNC). He is the holder of the Ruth and Stan Flinkman Chair for Brain Research at the Edmond and Lily Safra Center for Brain Science (ELSC). Prof. Tishby is one of the leaders of machine learning research and computational neuroscience in Israel and was recently selected by Intel to be one of the two PIs of the Intel Collaborative Research Institute for Computational Intelligence (ICRI-CI) in Israel. Tishby was the founding chair of the computer-engineering program, and a director of the Leibnitz research center in computer science, at the Hebrew University. He received his PhD in theoretical physics from the Hebrew University in 1985 and was a research staff member at MIT and Bell Labs from 1985 to 1991. Prof. Tishby was also a visiting professor at Princeton NECI, University of Pennsylvania, UCSB, and IBM research.
His current research is at the interface between computer science, statistical physics, and computational neuroscience. He pioneered various applications of statistical physics and information theory in computational learning theory. More recently, he has been working on the foundations of biological information processing and the connections between dynamics and information. He has introduced with his colleagues new theoretical frameworks for optimal adaptation and efficient information representation in biology, such as the Information Bottleneck method and the Minimum Information principle for neural coding.
One of the most intriguing questions in cognitive neuroscience is how our sensation and perception of time is related to the physical (Newtonian) time axis. In this talk I will argue that our sensation of time is scaled non-linearly with the information we have about the relevant past and future. In other words, we scale our internal clock with the number of "bits" of perceptual and actionable information, as determined by our sensory and planning tasks. To this end, I will introduce a Renormalisation Group procedure of the Bellman equation for Partially Observed Markov Decision Processes (POMDP), and argue that such renormalisation (non-linear rescaling of time) can explain the subjective discounting of rewards, and the emergence of hierarchies and reverse hierarchies in perception and planning. Finally, I will argue that the structure of our natural language reflects the "fixed point" of this renormalisation group – namely, the divergence of our planning and perception horizons.
Jeff Ullman is the Stanford W. Ascherman Professor of Engineering (Emeritus) in the Department of Computer Science at Stanford and CEO of Gradiance Corp. He received the B.S. degree from Columbia University in 1963 and the PhD from Princeton in 1966. Prior to his appointment at Stanford in 1979, he was a member of the technical staff of Bell Laboratories from 1966 to 1969, and on the faculty of Princeton University between 1969 and 1979. From 1990 to 1994, he was chair of the Stanford Computer Science Department. Ullman was elected to the National Academy of Engineering in 1989 and to the American Academy of Arts and Sciences in 2012, and has held Guggenheim and Einstein Fellowships. He has received the SIGMOD Contributions Award (1996), the ACM Karl V. Karlstrom Outstanding Educator Award (1998), the Knuth Prize (2000), the SIGMOD E. F. Codd Innovations Award (2006), and the IEEE von Neumann Medal (2010). He is the author of 16 books, including texts on database systems, compilers, automata theory, and algorithms.
After a brief review of how MapReduce works, we shall look at the trade-off that needs to be made when designing MapReduce algorithms for problems that are not embarrassingly parallel. In particular, the less data that one reducer is able to handle, the greater the total amount of data that must be communicated from mappers to reducers. We can view this trade-off as a function that gives the "replication rate" (average number of copies of an input communicated from mappers to reducers) in terms of the "reducer size" (number of inputs that can be accommodated at a reducer). For some interesting problems, including matrix multiplication and finding bit strings at Hamming distance 1, we can get precise lower bounds on this function, and also match the lower bounds with algorithms that achieve the minimum replication rate for a given reducer size.
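As a small illustration of the trade-off, the Python sketch below simulates, in a single process, one well-known way to find bit strings at Hamming distance 1: each string of length b is sent to two reducers, one keyed by its left half and one keyed by its right half. Two strings that differ in exactly one bit agree on at least one half, so they meet at some reducer. With all 2^b strings as input, this gives replication rate 2 and reducer size 2^(b/2). The code is a sketch under these assumptions, not the talk's own material; the function names are hypothetical and no real MapReduce framework is used.

```python
from collections import defaultdict
from itertools import product


def map_split(s):
    """Mapper: emit the string under two keys, one per half."""
    half = len(s) // 2
    yield ("left", s[:half]), s    # reducer for strings sharing this left half
    yield ("right", s[half:]), s   # reducer for strings sharing this right half


def hamming1_pairs(strings):
    """Simulate the map, shuffle, and reduce phases in memory.

    Returns (pairs at Hamming distance 1, replication rate, reducer size).
    """
    reducers = defaultdict(list)
    for s in strings:                      # map + shuffle
        for key, value in map_split(s):
            reducers[key].append(value)

    pairs = set()
    for group in reducers.values():        # reduce: compare all pairs locally
        for i, u in enumerate(group):
            for v in group[i + 1:]:
                if sum(a != b for a, b in zip(u, v)) == 1:
                    pairs.add((min(u, v), max(u, v)))

    communicated = sum(len(group) for group in reducers.values())
    replication_rate = communicated / len(strings)
    reducer_size = max(len(group) for group in reducers.values())
    return pairs, replication_rate, reducer_size


if __name__ == "__main__":
    b = 4
    all_strings = ["".join(bits) for bits in product("01", repeat=b)]
    found, rate, size = hamming1_pairs(all_strings)
    print(len(found), rate, size)
```

Sending every string to a single reducer would give replication rate 1 but reducer size 2^b; splitting into halves doubles the communication in exchange for much smaller reducers, which is the kind of trade-off the replication-rate function describes.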
Elad Yom-Tov is a Senior Researcher at Microsoft Research in Israel. Before joining Microsoft he was with Yahoo Research, IBM Research, and Rafael. Dr. Yom-Tov studied at Tel-Aviv University and the Technion, Israel. He has published two books, over 60 papers (of which 3 were awarded prizes), and filed more than 30 patents (13 of which have been granted so far). His primary research interests are in large-scale Machine Learning, Information Retrieval, and Social Analysis. He is a Senior Member of IEEE and held the title of Master Inventor while at IBM.
Abstract: Learning About Medicine by Applying Machine Learning to User Generated Content: The Case of Anorexia
Collecting medical information from large populations is expensive and difficult, especially when dealing with sensitive topics. In this talk, I will show that specific types of User Generated Content (UGC) contain a wealth of health and medical information which can be gleaned by applying Machine Learning methods to these data. As a case in point, I will focus on two questions related to anorexia: First, I will show the measurable effect of the media's portrayal of thinness on anorexic search behavior. Second, I’ll demonstrate the detrimental consequences of well-intended interventions through a photo-sharing site.