Refactoring HTML - Cafe con Leche XML News and Resources

Refactoring HTML
Elliotte Rusty Harold
[email protected]
http://www.cafeconleche.org/
Why Refactor
What to Refactor To
XHTML
 CSS
 REST

Move Away From
Tag soup
 Presentation based markup
 Stateful applications

XHTML
CSS
REST

All resources are identified by URLs.
 Safe, side-effect free operations such as
querying or browsing operate via GET.
 Non-safe operations operate via POST.
 Each request is independent of all others.
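
The GET/POST split above can be sketched as a tiny WSGI application. This is only an illustration under stated assumptions: the COMMENTS list, the /comments path, and the call helper are hypothetical names that do not come from the slides.

```python
import io
from wsgiref.util import setup_testing_defaults

# Hypothetical in-memory resource; a real application would use a database.
COMMENTS = ["First!"]


def app(environ, start_response):
    """GET only reads state; POST is the only method that changes it."""
    method = environ["REQUEST_METHOD"]
    if method == "GET":
        # Safe operation: repeating it has no side effects.
        body = "\n".join(COMMENTS).encode("utf-8")
        start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
        return [body]
    if method == "POST":
        # Non-safe operation: everything it needs arrives in this one request,
        # so it does not depend on any earlier request or session.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        COMMENTS.append(environ["wsgi.input"].read(length).decode("utf-8"))
        start_response("303 See Other", [("Location", "/comments")])
        return [b""]
    start_response("405 Method Not Allowed", [("Allow", "GET, POST")])
    return [b""]


def call(method, body=b""):
    """Invoke the app directly, without a network server (test helper)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["REQUEST_METHOD"] = method
    environ["PATH_INFO"] = "/comments"
    environ["CONTENT_LENGTH"] = str(len(body))
    environ["wsgi.input"] = io.BytesIO(body)
    statuses = []
    chunks = app(environ, lambda status, headers: statuses.append(status))
    return statuses[0], b"".join(chunks)
```

Calling call("GET") any number of times leaves COMMENTS unchanged; only call("POST", ...) modifies it.
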
Tools
The Refactoring Process
1. Identify the problem.
2. Fix the problem.
3. Verify that the problem has been fixed.
4. Check that no new problems have been introduced.
5. Deploy the solution.
Things Can Go Wrong
Backups
 Staging Servers
 Source Code Control

Validators
W3C Markup Validation Service
 LogValidator
 Xmllint
 Editors: DreamWeaver, BBEdit, etc.

Testing
HTMLUnit
 JsUnit
 HTTPUnit
 jWebUnit
 Fitnesse
 Selenium

Regular Expressions
Learn them!
 But be cautious
 Prefer parser-based
solutions
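
A small sketch of why a parser is usually the safer choice, using Python's standard html.parser module. The sample markup and the function names are made up for illustration, and the regular expression is deliberately naive.

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from a elements, whatever their case or quoting."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser reports tag and attribute names in lower case.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def links_with_parser(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


def links_with_regex(html):
    # Naive pattern: assumes lower-case tags, href first, double quotes.
    return re.findall(r'<a href="([^"]*)"', html)


SAMPLE = '''<A HREF=one.html>one</A>
<a class="x" href='two.html'>two</a>
<!-- <a href="dead.html">commented out</a> -->'''
```

On SAMPLE the parser finds one.html and two.html, while the naive pattern misses both and instead picks up the link inside the comment.
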

Tidy
C (and PHP)
 Custom API
 Can handle most bad markup
 Usually produces well-formed
XHTML
 Often produces valid XHTML
 $ tidy -asxhtml -m index.html

TagSoup
Java and SAX
 Can Handle Anything
 Always well-formed
 May not be valid
 $ java -jar tagsoup.jar --encoding=ISO-8859-1 index.html

Well-formedness Defined
Every element has one parent
element; no overlap
 Every start-tag has a case-sensitive
matching end-tag
 Attribute values are quoted
 Entity references are defined
 +Namespaces
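
These rules can be checked mechanically with any XML parser. A minimal sketch using Python's standard xml.etree.ElementTree; the is_well_formed helper is a hypothetical name, and the check says nothing about validity.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


EXAMPLES = {
    "<p>Hello <b>world</b></p>": True,               # properly nested
    "<p><b>bold <i>both</b> italic</i></p>": False,  # overlapping elements
    "<P>text</p>": False,                            # end-tag case does not match
    "<a href=index.html>home</a>": False,            # unquoted attribute value
    "<p>caf&eacute;</p>": False,                     # entity reference not defined
    "<x:p>text</x:p>": False,                        # namespace prefix not declared
}
```

Each key in EXAMPLES maps to the result is_well_formed is expected to give for it.
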

Well-formedness Refactorings
Make name lower case
Quote attribute value
Replace empty tag with empty-element tag
Add end-tag
Eliminate overlap
Convert text to UTF-8
Escape < and &
Introduce an XHTML DOCTYPE
Introduce the XHTML namespace
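
Several of these refactorings share one target form: lower-case names, quoted attribute values, explicit end-tags, and escaped < and &. A minimal sketch of that target form using Python's xml.sax.saxutils; the element helper is hypothetical and is not a refactoring tool in itself.

```python
from xml.sax.saxutils import escape, quoteattr


def element(name, text, **attributes):
    """Serialize one element the way well-formed XHTML expects it."""
    tag = name.lower()  # make name lower case
    attrs = "".join(
        f" {key.lower()}={quoteattr(value)}"  # quote attribute value
        for key, value in attributes.items()
    )
    # escape() replaces &, < and > in the text; the end-tag is always written.
    return f"<{tag}{attrs}>{escape(text)}</{tag}>"
```

For example, element("P", "AT&T < IBM", CLASS="note") yields <p class="note">AT&amp;T &lt; IBM</p>.
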
Validity Defined

The document has a DOCTYPE
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Strict//EN"
"/dtds/xhtml1-strict.dtd">
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Frameset//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">

Document adheres to constraints expressed in the DTD
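
The first condition, having a DOCTYPE, is easy to check in code. A rough sketch with Python's standard xml.dom.minidom; the xhtml_doctype helper and the PAGE sample are illustrative. It does not test the second condition: the standard library has no validating parser, so checking a page against the DTD is left to a validator such as the W3C Markup Validation Service or xmllint.

```python
from xml.dom import minidom

# Public identifiers of the three XHTML 1.0 DTDs.
XHTML_PUBLIC_IDS = {
    "-//W3C//DTD XHTML 1.0 Transitional//EN",
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "-//W3C//DTD XHTML 1.0 Frameset//EN",
}


def xhtml_doctype(markup):
    """Return the XHTML 1.0 public identifier a page declares, else None."""
    doctype = minidom.parseString(markup).doctype
    if doctype is not None and doctype.publicId in XHTML_PUBLIC_IDS:
        return doctype.publicId
    return None


PAGE = '''<!DOCTYPE html PUBLIC
  "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Example</title></head>
<body><p>Hello</p></body>
</html>'''
```
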
Validity Refactorings
Introduce Transitional DOCTYPE
 Introduce Strict DOCTYPE

Transitional
Eliminate bogons
 Add alt attributes
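
Before adding alt attributes, it helps to find the images that lack them. A minimal sketch using Python's standard html.parser, which tolerates pages that are not yet well-formed; the class name, function name, and sample markup are hypothetical.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Records the src of every img element that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # An empty alt="" is allowed (decorative image); a missing one is not.
        if tag == "img" and "alt" not in attributes:
            self.missing.append(attributes.get("src"))


def images_missing_alt(html):
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing


SAMPLE = '''<p><img src="logo.gif" alt="Cafe con Leche"></p>
<p><img src="spacer.gif"></p>
<p><img src="photo.jpg" alt="" /></p>'''
```

On SAMPLE only spacer.gif is reported.
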

Strict
Replace center, b, i, font, etc. with
CSS
 Nest inline elements in block
elements

Layout
Wrap related information in divs
Add ID attributes
Replace table layouts with CSS
Replace frames with CSS positions
Put the content first
Markup lists as lists
Replace blockquote/ul indentation with
CSS
Replace spacer GIFs
Accessibility
Convert images to text
Add labels to forms
Standard names for input fields
Add tab indexes to forms
Add skip navigation
Add internal headings
Provide captions, summaries, and headers
for tables
Identify acronyms
Web Applications
Replace GET with POST
Replace POST with GET
Replace Flash with HTML
Make web apps cache savvy
Provide ETags
Add Web Forms 2.0 Types
Block robots
Avoid SQL injection
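
One common way to avoid SQL injection is to pass user input as bound parameters rather than splicing it into the SQL string. A minimal sketch with Python's standard sqlite3 module; the users table and the function names are assumptions made for the example.

```python
import sqlite3


def make_database():
    """Build a throwaway in-memory database with a hypothetical users table."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    connection.executemany(
        "INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)]
    )
    return connection


def find_user(connection, name):
    # The ? placeholder keeps the value separate from the SQL text, so quotes
    # in the input are treated as data and cannot change the query.
    cursor = connection.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    )
    return cursor.fetchall()
```

With this form, an input such as x' OR '1'='1 is compared literally against the name column and matches no rows.
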
Content
Check spelling
 Check links
 Restructure sites but keep the URLs
 Remove entry pages
 Hide e-mail addresses from
spambots
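
One widely used way to hide addresses is to write them as numeric character references, which browsers decode but simple harvesters may not. The slides do not say which technique is intended, so treat this as an assumption; the function names are hypothetical, and a determined spambot can still decode the result.

```python
import html


def encode_address(address):
    """Encode every character as a decimal numeric character reference."""
    return "".join(f"&#{ord(char)};" for char in address)


def mailto_link(address):
    """Build a mailto link whose address never appears in plain text."""
    encoded = encode_address(address)
    return f'<a href="mailto:{encoded}">{encoded}</a>'


def decode_address(encoded):
    """Decode the references the same way a browser would."""
    return html.unescape(encoded)
```

For example, encode_address("a@b.c") gives &#97;&#64;&#98;&#46;&#99;, which a browser displays as a@b.c.
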

Objections To Refactoring

We don’t have the time to waste on
cleaning up the code. We have to get
this feature implemented now!

Refactoring saves time in the long
run.

You have more time than you think
you do.
Further Reading
Refactoring HTML: Elliotte Rusty Harold
Refactoring: Martin Fowler
Designing with Web Standards: Jeffrey Zeldman
The Zen of CSS Design: Dave Shea & Molly Holzschlag
