Look Over Here: Attention-Directing Composition of Manga Elements

Ying Cao
Rynson W.H. Lau
Antoni B. Chan
SIGGRAPH 2014
1
Outline
• Introduction
• Overview
• Data Acquisition and Preprocessing
• Probabilistic Graphical Model
• Learning
• Interactive Composition Synthesis
• Evaluation and Results
• Discussion
2
Introduction
• Goal
[Figure: example storyboard with five numbered panels. Scripts: "1. Rabbit, I came here for gold, 2. and I'm gonna get it! 3. I gotcha, you rabbit! I'll show you!", "You can't do this to me!" and "Eureka! Gold at last!". Panel labels include shot types (long, medium, close-up, big close-up), motion states (e.g. fast) and an inter-subject constraint (talk).]
3
Introduction
• We study the composition of manga elements: subjects and balloons.
• Manga artists guide the viewer's eyes through the page via subject and balloon placement.
• The path that guides readers through the artwork is the underlying artist's guiding path (AGP).
• The viewer's eye-gaze path through the page is the actual viewer attention.
4
Introduction
• We introduce a novel probabilistic graphical model for
subject-balloon composition.
• Based on this model, we propose an approach for placing a set of subjects and their balloons on a page in response to high-level user specification, and evaluate its effectiveness through a series of visual perception studies.
5
Overview
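[Figure: system overview. Annotation data and eye-tracking data are used to learn a probabilistic graphical model connecting the artist's guiding path (𝐆), composition (𝐂) and viewer attention (𝐀). An input storyboard is used to generate a layout, which is input to the model to infer the resulting composition.]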
6
Data Acquisition and Preprocessing
• To train our probabilistic model, we have collected a data set comprising 80 manga pages from three different series.
[Figure: an annotated page (shot type, motion state, subjects, balloons) and the eye movements of viewers.]
7
Probabilistic Graphical Model
• We propose a novel probabilistic graphical model to
hierarchically connect artist’s guiding path, composition and
viewer attention in a probabilistic network.
[Figure: the probabilistic graphical model connecting the artist's guiding path (𝐆), composition (𝐂) and viewer attention (𝐀).]
• The artist's guiding path (AGP) is abstracted as a latent variable in our model.
8
Probabilistic Graphical Model
• Our proposed model consists of 6 components, representing
different factors that influence the placement of elements on
the page.
9
(1)-Model Components and Variables
• In our model, the page consists of a set of panels.
• Each panel has subjects, each of which has balloons.
10
(1)-Model Components and Variables
• Artist's Guiding Path (AGP)
The underlying AGP f(t) and the actual AGP I(t) are represented as smooth splines over the page. Each curve is discretized by uniformly sampling control points along its length.
𝐈: actual AGP
𝐟: underlying AGP
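Uniform sampling along the curve length can be sketched as follows. This is only an illustration under simplifying assumptions: the curve is approximated by a polyline rather than a smooth spline, and the function name `resample_by_arc_length`, the example path and the number of samples are all hypothetical.

```python
import numpy as np

def resample_by_arc_length(points, n_samples):
    """Return n_samples points spaced uniformly along a polyline's length."""
    points = np.asarray(points, dtype=float)
    # Cumulative arc length at each input vertex.
    segment_lengths = np.linalg.norm(np.diff(points, axis=0), axis=1)
    arc = np.concatenate([[0.0], np.cumsum(segment_lengths)])
    # Target arc-length positions, evenly spaced from start to end.
    targets = np.linspace(0.0, arc[-1], n_samples)
    # Interpolate x and y independently as functions of arc length.
    x = np.interp(targets, arc, points[:, 0])
    y = np.interp(targets, arc, points[:, 1])
    return np.stack([x, y], axis=1)

# Hypothetical path over a page in normalized coordinates.
path = [(0.9, 0.1), (0.1, 0.1), (0.1, 0.9)]
control_points = resample_by_arc_length(path, 5)
```

Sampling by arc length rather than by curve parameter keeps the control points evenly spread over the page even when the parameterization is uneven.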
11
(1)-Model Components and Variables
• Panel Properties and Local Composition Model
We consider both semantic (i.e., shot type and motion state) and geometric (i.e., rough shape) properties of the panels.
Geometric style 𝒈 ∈ {geometric style 1 = 1, geometric style 2 = 2, geometric style 3 = 3}
Shot type 𝒕 ∈ {long = 1, medium = 2, close-up = 3, big close-up = 4}
Motion state 𝒎 ∈ {slow = 1, medium = 2, fast = 3}
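One possible way to encode these discrete panel properties in code is shown below. The class and field names are hypothetical; only the integer codes come from the slide.

```python
from dataclasses import dataclass
from enum import IntEnum

class ShotType(IntEnum):
    LONG = 1
    MEDIUM = 2
    CLOSE_UP = 3
    BIG_CLOSE_UP = 4

class MotionState(IntEnum):
    SLOW = 1
    MEDIUM = 2
    FAST = 3

@dataclass(frozen=True)
class PanelProperties:
    geometric_style: int        # 1, 2 or 3
    shot_type: ShotType
    motion_state: MotionState

    def __post_init__(self):
        # Reject geometric styles outside the three listed values.
        if self.geometric_style not in (1, 2, 3):
            raise ValueError("geometric style must be 1, 2 or 3")

# Hypothetical panel: a fast close-up with geometric style 2.
panel = PanelProperties(geometric_style=2,
                        shot_type=ShotType.CLOSE_UP,
                        motion_state=MotionState.FAST)
```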
12
(1)-Model Components and Variables
• Panel Properties and Local Composition Model
We define $x_L$ and $r_L$ as the possible subject location and size according to the local composition in the panel.
13
(1)-Model Components and Variables
• Subject Placement
The actual placement of a subject is a mixture of its local position and an associated point on the global AGP.
We denote the subject's location and size as $x_S$ and $r_S$.
14
(1)-Model Components and Variables
• Balloon Placement
The placement of a balloon depends on its subject's configuration, its size, and its reading order, as well as an associated point on the AGP.
We denote the balloon's position and size as $x_B$ and $r_B$.
15
(1)-Model Components and Variables
• Viewer Attention Transitions
For each panel, we define a set of binary variables {$U_{ij}$}, where $U_{ij} = 1$ indicates that there is a viewer transition between elements i and j.
16
(1)-Model Components and Variables
• The complete model is obtained by putting the six model components together.
17
(2)- Probability Distributions
• Each random variable $x$ in our model is associated with a conditional probability distribution (CPD), $p(x \mid \mathrm{pa}(x))$, which represents the probability of observing $x$ given its parents $\mathrm{pa}(x)$.
• We next describe the CPDs used for each variable in our model.
18
(2)- Probability Distributions
• Artist's Guiding Path (f, I).
The two coordinate components of the curve are modeled as two independent Gaussian processes with squared exponential covariance functions.
The actual AGP I is a noisy version of the underlying AGP f.
$\mathcal{N}(x \mid \mu, \Sigma)$ denotes a multivariate Gaussian distribution of x, with mean µ and covariance Σ.
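A minimal sketch of drawing a path from such a prior is given below. It assumes zero-mean Gaussian processes, isotropic Gaussian observation noise, and arbitrary hyperparameter values (`signal_var`, `length_scale`, `noise_std`); none of these choices are taken from the slides.

```python
import numpy as np

def squared_exponential(t1, t2, signal_var=1.0, length_scale=0.2):
    """Squared exponential covariance between two sets of curve parameters."""
    d = t1[:, None] - t2[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 20)          # curve parameter at the control points

# Covariance matrix with a small jitter so the Cholesky factorization is stable.
K = squared_exponential(t, t) + 1e-6 * np.eye(len(t))
L = np.linalg.cholesky(K)

# Two independent GP draws: column 0 is the x coordinate, column 1 is y.
underlying_agp = L @ rng.standard_normal((len(t), 2))

# Actual AGP as a noisy version of the underlying one (assumed noise model).
noise_std = 0.05
actual_agp = underlying_agp + rng.normal(0.0, noise_std, size=underlying_agp.shape)
```

The length scale controls how quickly the path can change direction: larger values give smoother curves.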
19
(2)- Probability Distributions
• Panel Properties (P).
The shot type t, motion state m and geometric style g are all discrete random variables with categorical distributions.
• Local Composition ($x_L$, $r_L$).
To describe the complexities of local foreground placement $x_L$, we use a Gaussian mixture model (GMM).
The local subject size $r_L$ is Gaussian.
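The following sketch illustrates sampling from a categorical distribution and from a two-dimensional GMM, and evaluating the GMM density. All probabilities, means and covariances are made-up values for illustration, not learned parameters from the model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical categorical distribution over the four shot types.
shot_probs = np.array([0.2, 0.4, 0.3, 0.1])
shot_type = rng.choice([1, 2, 3, 4], p=shot_probs)

# Hypothetical two-component GMM over the local subject position.
weights = np.array([0.6, 0.4])
means = np.array([[0.3, 0.5], [0.7, 0.5]])
covs = np.array([np.eye(2) * 0.01, np.eye(2) * 0.02])

def sample_gmm(rng, weights, means, covs):
    """Pick a component by its weight, then sample from that Gaussian."""
    k = rng.choice(len(weights), p=weights)
    return rng.multivariate_normal(means[k], covs[k])

def gmm_density(x, weights, means, covs):
    """Density of a 2-D Gaussian mixture at point x."""
    total = 0.0
    for w, mu, cov in zip(weights, means, covs):
        diff = x - mu
        norm = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))
        total += w * norm * np.exp(-0.5 * diff @ np.linalg.solve(cov, diff))
    return total

x_local = sample_gmm(rng, weights, means, covs)
```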
20
(2)- Probability Distributions
• Subjects and Balloons (S, B).
The CPD of the subject position $x_S$ is defined over its continuous parent variables, and the CPD of the balloon position $x_B$ is defined similarly over its own continuous parent variables.
The CPD of the subject size $r_S$ has a weight parameter ω and a variance σ².
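As a rough illustration of a CPD with continuous parents, the sketch below treats the subject position as a Gaussian centered on a weighted blend of its local position and its associated AGP point. The convex-combination form, the isotropic noise, and the names `blend_mean`, `weight` and `variance` are assumptions made for this example; the exact CPDs are not recoverable from the slide.

```python
import numpy as np

def blend_mean(x_local, agp_point, weight):
    """Weighted blend of the local position and the associated AGP point."""
    x_local = np.asarray(x_local, dtype=float)
    agp_point = np.asarray(agp_point, dtype=float)
    return weight * x_local + (1.0 - weight) * agp_point

def sample_subject_position(rng, x_local, agp_point, weight, variance):
    """Draw a position from an isotropic Gaussian around the blended mean."""
    mean = blend_mean(x_local, agp_point, weight)
    return rng.normal(mean, np.sqrt(variance))

rng = np.random.default_rng(0)
# Hypothetical local position and AGP point in normalized panel coordinates.
position = sample_subject_position(rng, [0.2, 0.4], [0.6, 0.8],
                                   weight=0.5, variance=0.001)
```

A weight near 1 keeps the subject close to its locally preferred position, while a weight near 0 pulls it onto the guiding path.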
21
(2)- Probability Distributions
• Viewer Attention Transitions (U = {𝑼𝒊𝒋 }).
Let 𝑂𝑖𝑗 be a set of parent random variables of 𝑈𝑖𝑗 . We define
the CPD of 𝑈𝑖𝑗 as
- We define
.
The potential function is a linear combination of two terms,
22
Learning
• The goal of the offline learning stage is to estimate the
parameters θ in the CPDs of all random variables in the
probabilistic model, from the training set D.
• Parameters are estimated with the expectation-maximization (EM) algorithm [Bishop 2006].
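To make the EM idea concrete, here is a small self-contained sketch that fits a two-component one-dimensional Gaussian mixture. It is a textbook toy example on synthetic data, not the learning procedure used for the full model; the function name `em_gmm_1d` and the data are hypothetical.

```python
import numpy as np

def em_gmm_1d(x, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture with EM."""
    # Initialize the means at the data extremes and share the data variance.
    mu = np.array([x.min(), x.max()], dtype=float)
    var = np.array([x.var(), x.var()], dtype=float)
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        dens = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = np.maximum((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk, 1e-6)
        pi = nk / len(x)
    return pi, mu, var

rng = np.random.default_rng(0)
# Synthetic data drawn from two well-separated Gaussians.
data = np.concatenate([rng.normal(-2.0, 0.5, 300), rng.normal(3.0, 0.5, 200)])
pi, mu, var = em_gmm_1d(data)
```

The E-step fills in the hidden component assignments softly, and the M-step updates the parameters as if those soft assignments were observed. The same alternation applies when the hidden quantity is a latent variable such as the AGP.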
23
BISHOP, C. 2006. Pattern Recognition and Machine Learning. Springer.
Interactive Composition Synthesis
• Generate a composition, subject to user-specified semantics.
Input:
- subject & script, e.g. "1. Rabbit, I came here for gold, 2. and I'm gonna get it! 3. I gotcha, you rabbit! I'll show you!"
- shot type & motion state, e.g. close-up, fast
- inter-subject constraint, e.g. talk
• Two stages: layout generation, then composition synthesis.
24
(1)-Layout Generation
• We use a simple search algorithm to retrieve the best-fitting layout from our database of labeled pages.
The input and each layout candidate are compared panel by panel; for the i-th panel:
- 𝑡𝑖 ∈ {1,2,3,4} : shot type
- 𝑚𝑖 ∈ {1,2,3} : motion state
- 𝑛𝑖 : the number of elements
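A retrieval of this kind could look like the sketch below. The matching score is an assumption: it sums absolute differences of shot type, motion state and element count per panel, and rejects candidates with a different number of panels. The actual scoring function is not recoverable from the slide, and the database contents are made up.

```python
def layout_distance(input_panels, candidate_panels):
    """Sum of per-panel differences in (shot type, motion state, element count)."""
    if len(input_panels) != len(candidate_panels):
        return float("inf")  # assumed: panel counts must match
    return sum(
        abs(a[0] - b[0]) + abs(a[1] - b[1]) + abs(a[2] - b[2])
        for a, b in zip(input_panels, candidate_panels)
    )

def retrieve_layout(input_panels, database):
    """Return the name of the labeled page whose panels best fit the input."""
    return min(database, key=lambda name: layout_distance(input_panels, database[name]))

# Hypothetical database: page name -> list of (t_i, m_i, n_i) per panel.
database = {
    "page_a": [(1, 1, 2), (3, 3, 1)],
    "page_b": [(3, 3, 2), (2, 2, 1)],
    "page_c": [(3, 3, 2)],
}
query = [(3, 3, 2), (2, 1, 1)]
best = retrieve_layout(query, database)
```

With these values, `page_b` differs from the query only in the motion state of the second panel, so it is the closest candidate.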
25
(2)-Composition via MAP Inference
- $X_E$: input elements and semantics, plus the layout
- $Y_C$: constraints
- $X_U$: configurations of elements
• The objective of maximum a posteriori (MAP) inference is to find a solution for $X_U$ that maximizes the posterior probability,
$$X_U^{*} = \arg\max_{X_U} \; p(X_U \mid X_E, Y_C).$$
26
(2)-Composition via MAP Inference
Constraint-based Likelihood.
- The likelihood combines several constraint terms, where {ρi} are weights controlling the importance of the different terms.
- Our implementation uses ρ1 = ρ2 = 0.3, ρ3 = ρ4 = 0.2.
27
(2)-Composition via MAP Inference
Constraint-based Likelihood.
- 𝐶𝑜𝑣𝑒𝑟𝑙𝑎𝑝 : overlap term
- 𝐶𝑜𝑟𝑑𝑒𝑟 : order term
28
(2)-Composition via MAP Inference
Constraint-based Likelihood.
- 𝐶𝑏𝑜𝑢𝑛𝑑 : boundary term
- 𝐶𝑟𝑒𝑙𝑎𝑡𝑖𝑜𝑛 = 𝐶size + 𝐶interact : subject relation term
[Figure: two subjects with sizes 𝑟𝑖 and 𝑟𝑗 and the vector 𝐯ij between them.]
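The sketch below shows one way weighted constraint terms could be combined into a likelihood score. Several things here are assumptions: elements are approximated by circles, the two cost functions are simple stand-ins, the likelihood is taken as the exponential of the negative weighted cost, and the assignment of the weights 0.3, 0.3, 0.2, 0.2 to the overlap, order, boundary and relation terms follows the order in which the terms appear on the slides.

```python
import math

def overlap_cost(c1, r1, c2, r2):
    """Penetration depth of two circles (0 when they do not overlap)."""
    dist = math.hypot(c1[0] - c2[0], c1[1] - c2[1])
    return max(0.0, r1 + r2 - dist)

def boundary_cost(c, r, panel):
    """How far a circle extends outside an axis-aligned panel (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = panel
    return (max(0.0, x0 - (c[0] - r)) + max(0.0, (c[0] + r) - x1)
            + max(0.0, y0 - (c[1] - r)) + max(0.0, (c[1] + r) - y1))

# Assumed mapping of the four weights to the four terms.
WEIGHTS = {"overlap": 0.3, "order": 0.3, "bound": 0.2, "relation": 0.2}

def constraint_likelihood(costs, weights=WEIGHTS):
    """Assumed form: exp of the negative weighted sum of the cost terms."""
    total = sum(weights[name] * costs[name] for name in weights)
    return math.exp(-total)

# Hypothetical configuration: two overlapping circles inside a unit panel.
costs = {
    "overlap": overlap_cost((0.0, 0.0), 1.0, (1.5, 0.0), 1.0),
    "order": 0.0,
    "bound": boundary_cost((0.5, 0.5), 0.2, (0.0, 0.0, 1.0, 1.0)),
    "relation": 0.0,
}
score = constraint_likelihood(costs)
```

A configuration that violates no constraint scores 1.0, and the score decays as violations grow.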
29
Evaluation and Results
30
(1)-Comparison to Heuristic Method
• Visual Perception Study.
The goal of the visual perception study is to investigate whether participants have a strong preference for our results over those produced by the heuristic method [Chun et al. 2006].
31
CHUN, B., RYU, D., HWANG, W., AND CHO, H. 2006. An automated procedure for word balloon placement in cinema comics. LNCS 4292, 576–585.
(1)-Comparison to Heuristic Method
• Visual Perception Study.
32
(1)-Comparison to Heuristic Method
• Eye-tracking experiment and analysis.
We measure the consistency in both unordered and ordered eye fixations across different viewers.
- Inlier percent [Judd et al. 2009]: a saliency map from Viewer A is used to classify the fixations of Viewer B as inliers.
- Root mean squared distance (RMSD) between the fixations of Viewer A and Viewer B.
JUDD, T., EHINGER, K., DURAND, F., AND TORRALBA, A. 2009. Learning to predict where humans look. In ICCV’09.
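Both measures can be sketched in a few lines. The details below are assumptions for illustration: the RMSD presumes the two fixation sequences have already been brought to equal length, and the inlier test thresholds the saliency map at its top 20 percent of values, which may differ from the procedure in [Judd et al. 2009].

```python
import numpy as np

def rmsd(path_a, path_b):
    """Root mean squared distance between two equal-length fixation sequences."""
    a = np.asarray(path_a, dtype=float)
    b = np.asarray(path_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("fixation sequences must have the same length")
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))

def inlier_percent(saliency_map, fixations, top_fraction=0.2):
    """Percentage of (row, col) fixations landing in the most salient region."""
    threshold = np.quantile(saliency_map, 1.0 - top_fraction)
    hits = [saliency_map[row, col] >= threshold for row, col in fixations]
    return 100.0 * sum(hits) / len(hits)

# Hypothetical data: viewer B is offset from viewer A by (3, 4) at every fixation.
viewer_a = [(0, 0), (0, 0)]
viewer_b = [(3, 4), (3, 4)]
distance = rmsd(viewer_a, viewer_b)

# Hypothetical 5x5 saliency map whose top row is the salient region.
saliency = np.zeros((5, 5))
saliency[0, :] = 1.0
percent = inlier_percent(saliency, [(0, 1), (3, 3)])
```

A lower RMSD and a higher inlier percent both indicate that different viewers read the page in a more consistent way.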
33
(1)-Comparison to Heuristic Method
• Eye-tracking experiment and analysis.
[Figure: example compositions with eye-tracking data.]
34
(2)-Comparison to Manual Method
[Figure: participant preference voting, and time for one composition (the two values shown are 140 and 45).]
35
(3)-Comparison to Existing Manga Pages
36
(4)-Recovering Artist's Guiding Path
37
(5)-Limitations
• Our work has two limitations.
1. It assumes that variations in the spatial location and scale of elements are the only factors driving viewer attention.
2. For panels with more than four subjects, our approach can fail to produce satisfying results automatically.
38
Discussion
• We have proposed a probabilistic graphical model for
representing dependency among the artist’s guiding path,
composition and viewer attention.
• We show that compositions from our approach are more
visually appealing and provide a smoother reading experience,
as compared to those by a heuristic method.
• Our approach enables easy and quick creation of attention-directing compositions.
• It could be extended to other graphic design tasks.
39
References
• manga pic http://goo.gl/O2HNXb
40