Yenta: a simple recommendation language


Yenta
A Simple Recommendation Language
Kenny Rivera – Tester
Becky Tang – System Architect
Shylah Weber – Systems Integrator
Anthony Yim – Project Manager
Definition
Yenta
1) A Yiddish word describing an old woman who is a matchmaker, a Yentle (ref. Fiddler on the Roof).
2) In modern use, the meaning has become that of an annoying old hag (http://www.urbandictionary.com).
Motivation
Recommendation engines are becoming increasingly popular
We wanted to make recommendations:
• Fast
• Simple to program
• Flexible in content
Overview
Imperative language
Yenta allows recommendations based on either of the following inputs:
• A “seed” or example piece of data
• A set of criteria or “tags”
Title          Artist         Album                  Year  Genre    Length
Wrapped        George Strait  It Just Comes Natural  2006  Country  4:10
You Got Lucky  Tom Petty      Greatest Hits          2000  Rock     3:36
MMMBop         Hanson         Middle of Nowhere      1997  Pop      4:29
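The "seed" style of input can be illustrated with a minimal Java sketch. It is not taken from the Yenta implementation: the class name SeedSketch, the similarity method, and the seed row are all hypothetical, and the measure (fraction of matching tag values, skipping the title column) is just one simple choice.

```java
// Hypothetical sketch of seed-based scoring; not Yenta's actual runtime.
public class SeedSketch {
    // Fraction of columns, skipping column 0 (assumed to be the title),
    // in which the candidate row has the same value as the seed row.
    static double similarity(String[] seed, String[] candidate) {
        int matches = 0;
        for (int c = 1; c < seed.length; c++) {
            if (seed[c].equals(candidate[c])) {
                matches++;
            }
        }
        return (double) matches / (seed.length - 1);
    }

    public static void main(String[] args) {
        // The seed row is made up for illustration; the other rows come from the slide.
        String[] seed = {"(seed)", "George Strait", "(unknown album)", "2006", "Country", "3:50"};
        String[][] rows = {
            {"Wrapped", "George Strait", "It Just Comes Natural", "2006", "Country", "4:10"},
            {"You Got Lucky", "Tom Petty", "Greatest Hits", "2000", "Rock", "3:36"},
            {"MMMBop", "Hanson", "Middle of Nowhere", "1997", "Pop", "4:29"},
        };
        for (String[] row : rows) {
            System.out.println(row[0] + "\t" + similarity(seed, row));
        }
    }
}
```

Rows that share more tag values with the seed receive a higher score, so sorting by this value yields a recommendation list.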
Syntactic Constructs
• Sample Yenta program:
file myMusicFile = import("music.txt");
myMusicFile.applyScore("Year", "2006", "0.8", "Genre", "Pop", "0.2");
myMusicFile.printTop(3);
• Built-in subroutines (3 main steps)
– import(string fileName)
– applyBasicScore(string tag, string tagValue, double weight, . . . )
– printTop(int n)
• Java-like declarations of primitive types
– file type
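To make the scoring and ranking steps concrete, here is a rough Java sketch of what they could do for the sample program. It is an assumption, not the team's code: the class ScoreSketch and the method top are hypothetical, and "add the weight for every matching tag value" is a guess at the semantics suggested by the (tag, tagValue, weight) signature.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; names and semantics are assumptions, not Yenta's real runtime.
public class ScoreSketch {
    // criteria is a flat list: tag, value, weight, tag, value, weight, ...
    // A row gains `weight` for every (tag, value) pair it matches.
    static double[] applyBasicScore(String[] tags, String[][] rows, String... criteria) {
        double[] scores = new double[rows.length];
        for (int c = 0; c + 2 < criteria.length; c += 3) {
            int col = Arrays.asList(tags).indexOf(criteria[c]);
            if (col < 0) {
                continue; // unknown tag: ignored in this sketch
            }
            double weight = Double.parseDouble(criteria[c + 2]);
            for (int r = 0; r < rows.length; r++) {
                if (rows[r][col].equals(criteria[c + 1])) {
                    scores[r] += weight;
                }
            }
        }
        return scores;
    }

    // Returns the indices of the n highest-scoring rows, best first.
    static List<Integer> top(double[] scores, int n) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < scores.length; i++) {
            order.add(i);
        }
        order.sort((a, b) -> Double.compare(scores[b], scores[a]));
        return order.subList(0, Math.min(n, order.size()));
    }

    public static void main(String[] args) {
        String[] tags = {"Title", "Artist", "Album", "Year", "Genre", "Length"};
        String[][] rows = {
            {"Wrapped", "George Strait", "It Just Comes Natural", "2006", "Country", "4:10"},
            {"You Got Lucky", "Tom Petty", "Greatest Hits", "2000", "Rock", "3:36"},
            {"MMMBop", "Hanson", "Middle of Nowhere", "1997", "Pop", "4:29"},
        };
        double[] scores = applyBasicScore(tags, rows,
                "Year", "2006", "0.8", "Genre", "Pop", "0.2");
        for (int i : top(scores, 3)) {
            System.out.println(rows[i][0] + "\t" + scores[i]);
        }
    }
}
```

With these weights, "Wrapped" (a 2006 song) ranks first and "MMMBop" (a Pop song) second, which is the kind of ordering printTop(3) would be expected to print.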
Equivalent Java Code for Import()
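import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.StringTokenizer;

// rows, cols, tags, file and scores are fields of the enclosing class.
public void importFile(String filename) throws IOException {
    // Two readers over the same file: one to count, one to load.
    BufferedReader in = new BufferedReader(new FileReader(filename));
    BufferedReader in2 = new BufferedReader(new FileReader(filename));

    // First pass: count every line, keeping the header line for later.
    String line = in.readLine();
    String lineCopy = line;
    rows = 0;
    while (line != null) {
        rows++;
        line = in.readLine();
    }
    in.close();

    // Count the tab-separated columns in the header line.
    StringTokenizer st = new StringTokenizer(lineCopy, "\t");
    cols = 0;
    while (st.hasMoreTokens()) {
        st.nextToken();
        cols++;
    }

    // Second pass: the header line names the tags, one per column.
    tags = new Tag[cols];
    line = in2.readLine();
    StringTokenizer st3 = new StringTokenizer(line, "\t");
    for (int k = 0; (k < cols) && st3.hasMoreTokens(); k++) {
        tags[k] = new Tag(st3.nextToken());
        tags[k].col = k;
    }

    // Every remaining line is a data row.
    rows--; // the header is not a data row
    file = new String[rows][cols];
    line = in2.readLine();
    int i = 0;
    while (line != null) {
        StringTokenizer st2 = new StringTokenizer(line, "\t");
        int j = 0;
        // Extra cells beyond the header width are dropped.
        while ((j < cols) && st2.hasMoreTokens()) {
            file[i][j] = st2.nextToken();
            j++;
        }
        i++;
        line = in2.readLine();
    }
    in2.close();

    // One score per data row, initially 0.0.
    scores = new double[rows];
}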
Yikes!
Other Syntactic Constructs
• Flow control
– for loops
– if statements
• Other types
– int
– double
– string (seen)
– scoreFunc
Translator Architecture
Main.java -> Java compiler -> Main
Yenta program -> Main (run by the Java interpreter) -> Java file representing the Yenta program
Java file -> Java compiler -> Java program -> Java interpreter -> Yenta output
Translator Architecture
Main.java -> Java compiler -> Main
YentaProg.txt -> Main (run by the Java interpreter) -> YentaProg.java
YentaProg.java -> Java compiler -> YentaProg -> Java interpreter -> Yenta output
Commands, in order:
javac Main.java
java Main < YentaProg.txt
javac YentaProg.java
java YentaProg
Interpreter Implementation
• Wrote Java methods that are always in the output file
• Our grammar simply calls these methods with the programmer’s parameters
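The idea of a fixed set of Java methods that is always written to the output file, followed by calls built from the programmer's parameters, might be sketched as follows. Everything here is hypothetical (the class TranslatorSketch, the generate method, and the stub method bodies); the real translator produced its output through ANTLR and StringTemplate rather than a StringBuilder.

```java
// Hypothetical sketch of the "fixed runtime + generated calls" approach.
public class TranslatorSketch {
    // Runtime methods copied into every generated file (stubbed out here).
    static final String RUNTIME =
        "    static void importFile(String name) { System.out.println(\"import \" + name); }\n"
      + "    static void printTop(int n) { System.out.println(\"top \" + n); }\n";

    // Wraps already-translated calls in a class that carries the runtime methods.
    static String generate(String className, String... calls) {
        StringBuilder sb = new StringBuilder();
        sb.append("public class ").append(className).append(" {\n");
        sb.append(RUNTIME);
        sb.append("    public static void main(String[] args) {\n");
        for (String call : calls) {
            sb.append("        ").append(call).append(";\n");
        }
        sb.append("    }\n}\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints Java source that could be saved as YentaProg.java and compiled.
        System.out.print(generate("YentaProg", "importFile(\"music.txt\")", "printTop(3)"));
    }
}
```

Because the runtime methods never change, the grammar only has to emit one method call per Yenta statement.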
Development Environment
Tools
ANTLR v3
• Parser + Lexer
• Takes an LL(*) grammar as input.
• “Spits out” tokens as output.
Example ANTLR grammar
Tools
ANTLRWorks parse tree
Tools
StringTemplate
Example StringTemplate
Outputs… equivalent Java code that was “printed” out.
The Proposed Test Suite
• Unit testing
– Grammar was constructed based on several programs designed to exploit key features
– Advantages:
• Provides strict guidelines that code must satisfy.
• Unit testing finds problems early in development
• Fuzz testing
– Provides random data to our example programs to test for errors and inconsistencies
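As an illustration of the fuzz-testing idea, the sketch below generates a random tab-delimited table that could be saved and fed to a Yenta program's import step. It is an assumption about how such data might be produced; the class FuzzSketch, the method randomTable, and the "Tag0, Tag1, ..." header names are hypothetical.

```java
import java.util.Random;

// Hypothetical fuzz-data generator; not part of the original test suite.
public class FuzzSketch {
    // Builds a header line plus `rows` data lines, each with `cols` tab-separated cells.
    // A fixed seed makes a failing input reproducible.
    static String randomTable(long seed, int rows, int cols) {
        Random rnd = new Random(seed);
        StringBuilder sb = new StringBuilder();
        for (int c = 0; c < cols; c++) {
            if (c > 0) {
                sb.append('\t');
            }
            sb.append("Tag").append(c);
        }
        sb.append('\n');
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                if (c > 0) {
                    sb.append('\t');
                }
                int len = 1 + rnd.nextInt(8); // cell length between 1 and 8
                for (int k = 0; k < len; k++) {
                    sb.append((char) ('a' + rnd.nextInt(26)));
                }
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(randomTable(42L, 3, 4));
    }
}
```

Running the same Yenta program over many such tables, and checking that it neither crashes nor prints inconsistent rankings, is one way to carry out the fuzz testing described above.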
Advantages
Yenta is a simple, easy-to-use language:
• Easy to learn
• Fast to write
• Produces concise, easy-to-read code
• Usable by beginners, yet extensible with more advanced features
Advantages
Yenta’s primary advantage is that it can be used to recommend anything a programmer wishes (like Columbia classes!)