Mining Accumulated Crop Cultivation
Problems and their Solutions
Shivendra Tiwari
Arvind Kumar Mahla
Outline
• Introduction
• Motivation
• Related Work
• Objectives
• Challenges & Problem Definition
• Proposed Solution - Template Based Association Rules Mining
• Case Study: VERCON 2006
• Discussion and Questions
Introduction
Fast access to solutions for common problems can greatly improve
agricultural productivity. Farmers in developing countries often lack
expertise and depend on expert advice.
Farmers Problems Database
(VERCON subsystem in Egypt – Agricultural problem database)
• A web interface for users to input a problem (meta-data descriptors and free
text description)
• Problem forwarded to researcher
• Solution in free text from researcher
• Problem and its solution stored in textual database
Usage:
• Search for similar problem and solution
• Post as a new problem
Motivation
• The Agricultural problems database grew significantly (10000+)
over a period of five years.
• As the database grew, locating similar problems became difficult,
which led to redundant entries
• The queries and the solutions are unstructured
• The problems and corresponding solutions can be extensively
used by the Decision makers, Researchers, and Farmers
Objectives
• Adding new problems while avoiding the insertion of
duplicates.
• Validation & Modification of the solutions by the domain
experts.
• Accessing the existing solutions efficiently and accurately.
• Inconsistency resolution in the problems and
corresponding solutions.
• Removal of the outdated material from the DB.
• Problem resolution without a domain expert’s help, based
on past patterns.
• Decision/Policy Making using the Patterns and relations.
• Problem prediction.
Challenges & Problem Definition
• Plain Text: The problems and solutions are stored in plain text format.
• Information Extraction: the plain text must first be converted to structured data.
• Problem Classification (i.e. weeds, diseases, pests, fertilization and
irrigation)
• Identification of the Complaint Object: farmers often do not know what
the problem is; they just enter the symptoms.
• Feature description of the Complaint Object.
• Text data variety:
– Discovery of similar complaint written in different styles.
– Single complaint may contain one or more primary complaints.
– Complaints can look similar, but they are actually different.
• Data representation
– Structured Problem/Query Formulation
– Structured Solution Formulation
• Extraction Algorithm
– Summarize and Analyze Information
Related Work
• Opinion Mining: Used to assist customers in product review
before purchase.
– Display: bright, dark, clear etc (for a mobile phone)
– Look: stylish, traditional, moderate etc.
– Weight: heavy, slim, light etc
• Association Rules: Extracting Product Feature from English
Product Reviews.
• Opinion Observer: observe the advantages and
disadvantages of a product by collecting positive and negative
words in the review.
• Ontology Usage: use of ontology to discover the problem
object, extracting key words and sentences etc.
Sample Problems
• There are spots on the leaves and on the spikes
which have a cotton like texture and which turn to
grey in some areas within the planted 25 feddan
land.
{color=gray, texture=cotton like}
• There are white, non-uniform spots with cotton
like texture on the lower surface of plant leaves.
{plant=“”, color=white, texture=cotton like,
location=lower surface, distribution=non-uniform}
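One reason to structure complaints this way is that two descriptions can then be compared attribute by attribute. A minimal sketch, assuming a Jaccard overlap of attribute=value pairs as the similarity measure (this measure and the function names are illustrative assumptions, not the method described in the slides):

```python
# The two sample problems above, written as attribute -> value templates.
complaint_a = {"color": "gray", "texture": "cotton like"}
complaint_b = {
    "plant": "",  # empty in the sample: the plant was not named
    "color": "white",
    "texture": "cotton like",
    "location": "lower surface",
    "distribution": "non-uniform",
}

def attribute_pairs(complaint):
    """Return the set of (attribute, value) pairs, skipping empty values."""
    return {(key, value) for key, value in complaint.items() if value}

def similarity(a, b):
    """Jaccard overlap of attribute=value pairs (assumed measure)."""
    pairs_a, pairs_b = attribute_pairs(a), attribute_pairs(b)
    union = pairs_a | pairs_b
    return len(pairs_a & pairs_b) / len(union) if union else 0.0

score = similarity(complaint_a, complaint_b)
```

Here the two complaints share only the texture pair, so the score is low even though the free-text descriptions look alike.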
Template Based Association Rules Mining
• Template Based Data Storage
– Named Entries
– Time Based Entries
– Number Based Entries
– Percentage and Rates
• Data Representation (Predefined Template) - Multi-Faceted Object Extraction Methodology
– Structured Problem Formulation
– Structured Solution Formulation
• Information Extraction Algorithm
– Summarize and Analyze Information (association
rules mining)
Template Based Association Rules Mining – cont…
• Metadata is used to classify problems &
extract attributes.
• The complaint text is scanned word by
word.
• Ontology is used to identify the
agricultural objects and associated
features.
• The word found is marked as identified
and location is stored.
• A template is used to store a problem
and the solutions.
• Main Object of Complaint (MOC) and
Main Object of Solution (MOS) are
extracted last.
• One Complaint Object contains both the
MOC and the MOS.
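The scanning step above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the tiny ONTOLOGY dictionary and the function name extract_attributes are hypothetical, and the sketch matches whole ontology terms rather than walking the text literally word by word.

```python
import re

# Hypothetical mini-ontology: attribute name -> known values (illustrative only).
ONTOLOGY = {
    "color": ["white", "grey", "gray", "yellow"],
    "texture": ["cotton like"],
    "location": ["lower surface", "upper surface"],
    "distribution": ["non-uniform", "uniform"],
    "part": ["leaves", "spikes"],
}

def extract_attributes(text):
    """Find ontology terms in a complaint and store each term with its position."""
    lowered = text.lower()
    found = {}
    for attribute, values in ONTOLOGY.items():
        for value in values:
            # The lookarounds stop "uniform" from matching inside "non-uniform".
            pattern = r"(?<![\w-])" + re.escape(value) + r"(?![\w-])"
            match = re.search(pattern, lowered)
            if match:
                found.setdefault(attribute, []).append((value, match.start()))
    return found
```

Each identified term is stored with its character offset, mirroring the "marked as identified and location is stored" step; the resulting dictionary is what would fill the problem side of the template.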
Association Rule Mining
Given: Item sets, minimum support & confidence levels
Output: Association rules (A -> B, A -> C, B -> C)
Algorithm: Apriori or its variants
• The algorithm first finds the frequent item sets, each containing
one or more items.
• Association rules are then formulated from these frequent item
sets.
• A rule B -> C holds with confidence c if c% of the records
which contain B also contain C
• A rule B -> C has support s in the dataset if s% of all
records contain B ∪ C (i.e. both B and C)
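The support and confidence definitions above can be checked on a toy example. The sketch below is a minimal illustration, assuming each record is a set of attribute=value items; the four records are invented for illustration and are not VERCON data. It covers only single-item rules (the first two levels of Apriori), not the full algorithm.

```python
from itertools import combinations

# Hypothetical records: each is the item set extracted from one complaint.
records = [
    frozenset({"color=white", "texture=cotton like", "class=disease"}),
    frozenset({"color=grey", "texture=cotton like", "class=disease"}),
    frozenset({"color=white", "texture=cotton like", "class=disease"}),
    frozenset({"color=yellow", "class=fertilization"}),
]

def support(records, itemset):
    """Fraction of records that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= record for record in records) / len(records)

def confidence(records, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(records, antecedent)
    if base == 0:
        return 0.0
    return support(records, set(antecedent) | set(consequent)) / base

def pair_rules(records, min_support, min_confidence):
    """Single-item rules A -> B that meet both thresholds."""
    items = sorted({item for record in records for item in record})
    # Apriori pruning: only frequent single items can form frequent pairs.
    frequent = [i for i in items if support(records, [i]) >= min_support]
    rules = []
    for a, b in combinations(frequent, 2):
        pair_support = support(records, [a, b])
        if pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            c = confidence(records, [lhs], [rhs])
            if c >= min_confidence:
                rules.append((lhs, rhs, pair_support, c))
    return rules

rules = pair_rules(records, min_support=0.5, min_confidence=0.8)
```

With these thresholds, the rule texture=cotton like -> class=disease is kept because three of the four records contain both items and every record with that texture is classed as a disease.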
Case Study - VERCON
(Virtual Extension and Research Communication Network)
• VERCON is a conceptual model of the Food and Agriculture
Organization (FAO) of the United Nations.
• It has been adopted by over 7 countries (e.g. the governments of
Bhutan and Egypt).
• It is used to improve and establish a national agricultural
knowledge
• Aims and Challenges:
– Strong linkage between research and field implementation.
– Easy access of agricultural information.
– Connecting geographically dispersed people and enhancing two-way
communication.
– Rapid collection, processing, and dissemination of information, and
management of large volumes of data, are the key challenges.
Case Study:
VERCON – objectives
VERCON – globally shared information
VERCON – how does it work, a new problem?
1. Problem reporting
a. A cabbage crop has been infected with a new leaf disease
b. A leaf is sent to the local extension office
2. Disease identification at local extension office
a. The database contains locally taken photos and
internationally compiled images of the leaves.
b. The extensionist compares the leaf with the images, but cannot
recognize the disease.
3. Contact the specialist at the nearest research
station
a. Post an online enquiry with photos of the infected
leaves.
b. Also send the sample leaves by postal mail.
c. The email message is copied to other extension and
research stations to alert them to the new ‘problem’.
VERCON – how does it work?
1. Problem Diagnosis at Specialist level
a. If the problem is new, the specialist discusses it
with colleagues.
b. The specialist contacts the extensionist via email to
find out more, such as which crops are
growing nearby.
2. Diagnosis at Researchers End
a. The researcher confirms the diagnosis as a
fungus previously found only in another part of
the country.
b. The researcher suggests suitable disease management
practices in the context of the farmer’s situation.
3. Publish the factsheet on the new problem to all
stakeholders and extension offices.
4. Extension services communicate new problem and
corresponding solutions to the farmers.
Thank You
Questions & Discussion