Arshavir Blackwell

Projects

2008 - 2010. Senior Scientist, Fox Audience Network, Santa Monica, CA

Demographic prediction engine. Demographic statistics such as age and gender are key to refining ad targeting, but these values are often missing for individual users. We model the profiles of users with known demographic values in order to predict (e.g.) age and gender for users where those features are missing. This project requires a full understanding of the relevant algorithms (e.g., Support Vector Machines) as well as the ability to implement them in Java and to perform large-scale (5 terabyte+) data analyses in Hadoop. Code must pass production-level quality testing.

Buzz tracking. Knowing which trends are important over time ("buzz") is key to a variety of marketing initiatives, including ad targeting. Advertisers want to see evidence of lift in terms related to the products or services that they are advertising. They may also be interested in which terms related to a product are currently trending, in order to link those terms to their ads. This system is a response to that need. It uses trend and rate analysis algorithms to track changes in the frequency and intensity of targeted buzz words. This project requires a full understanding of the relevant algorithms (e.g., the bursty stream algorithm) as well as the ability to implement them in Java and to perform large-scale (5 terabyte+) data analyses in Hadoop. Code must pass production-level quality testing.

Advise music recommender student internship. Users seek new sources of entertainment online, including music and movies. This project addressed this need by creating a prototype music recommender system that leverages the advantages of social media. I directed Harvey Mudd College interns in a nine-month project to design and deploy a music recommender system using Java, C++, and JavaScript; the interns designed and wrote the details themselves. The system was designed to work as a MySpace Widget plug-in, using social data to mine a user's social network and develop a cohort of experts for a particular musical genre within the user's social sphere. This was an innovative approach, in contrast with other recommender systems that only compare items a user likes to similar items, or group users based upon shared preferences, without regard to actual social links.

Intent Miner. Helped to develop and test system to extract intents from unstructured text (e.g., "intent to purchase car," "intent to purchase cellphone," "just married," "just had a child") in order to classify users into hyper-targeting groups (e.g., more likely to be interested in home decor or baby products). This project required an understanding of natural language algorithms, Java, and JUnit testing.

2007. Principal Computer Scientist, MetaLINCS, San Jose, CA

Innovation team. Support innovations and improvements to the MetaLINCS flagship e-discovery application. This requires a) a complete understanding of the current product and its embedded algorithms, as well as of other algorithms that might be of potential benefit; b) expert skill in Java; and c) expertise in a wide variety of natural language processing algorithms.

2005 - 2007. Senior Scientist/Director of Research, H5 Technologies, San Francisco, CA

Lead research and development. Research and develop improvements to business processes in order to increase accuracy and speed and lower cost. This requires a complete understanding of research and analytical methodologies needed to evaluate the performance of the business processes, especially as they relate to large scale document analysis.

Build software tools. Using Java and JavaServer Faces (JSF), act as part of the team to architect, develop, and test new tools, particularly search. These tools support the company's core mission, which is to analyze very large (on the order of millions) document sets, in order to identify documents relevant to a particular legal case.

2003 - 2005. Senior Engineer & Project Lead, Entrieva, Reston, VA.

Unstructured document management applications. Using C++ and Java, acted as lead to maintain and upgrade current categorization software central to solutions provided by Entrieva. Architected new solutions to augment product portfolio in order to expand the company's services and increase its competitiveness. This required a complete understanding of the company's proprietary language processing algorithms.

2001 - 2003. Principal, adaptiveLava, Oakland, CA.

Peer-to-peer artificial intelligence. Chief architect of application in ongoing project to merge peer-to-peer functionality with artificial intelligence/natural language systems in an enterprise environment, based on open source code. By utilizing search and retrieval algorithms, the application makes previously inaccessible files on individual PCs accessible and available to many users within an enterprise or knowledge community, rather than only files specifically pushed to servers. This enables the enterprise or community to leverage existing intellectual property assets to a level not before possible.

2001. Principal Scientist, Comprecorp, Nevada City, CA.

Classifier project. Lead on project to design and implement engine of the Comprecorp Classifier, intended to classify e-mail and other such free-text documents of arbitrary length, according to user-specified categories. The user does not have to tell the system what rules are used to put a document in a particular category. It learns by example from looking at documents already in categories. Potential uses of such a system go beyond e-mail classification, to a wide variety of large-scale document management and data mining applications.

1999 - 2000. Senior Engineer, Ask Jeeves, Emeryville, CA.

Jeeves automation project. Lead on project to improve accuracy and lower cost for Jeeves question-answering system. This required expert programming skills (the product used C++ applications delivered through a web interface using Microsoft ASP) and a thorough understanding of research and analytical methodologies to evaluate system performance. In the original system, creating and maintaining a knowledge base of questions and answers was too labor intensive, and too costly, to encourage the use of the product by smaller businesses, particularly in the face of increasing competition from other question-answering products.

As project lead, I identified bottlenecks in the creation of knowledge bases that were amenable to adaptive automation, created design specifications, and led a team in writing code to implement those changes. I presented both method and results to company members through meetings and on-line publications, and worked with in-house customers to refine the prototype's usability. The result was comparable prototype knowledge bases whose creation and maintenance required significantly less human effort and cost less.