This Code Review Talk is Excellent but…
“[P]eer code reviews are the single biggest thing you can do to improve your code.”
Code designed by Nina Geometrieva from the Noun Project
About me: Mohammad (Mo) Jangda
– Toronto, Canada
– Code Wrangler at Automattic (WordPress.com)
– Ice Cream Fan
Who, in the audience, has some code review process in their organization or workflow?
Outline
– My history with Code Review
– Why bother?
– How to implement?
– Golden Rules.
My History with Code Review
Early in my career (school, various development jobs, freelance work, open source projects), I always felt like I was missing something.
Saw the Light
First* week on the job:
“So, let’s get you started by having you review this code sent in by a client!”
— Team Lead
Thrown right in and ~50% of my job for 3 years.
* after completing Automattic’s standard support rotation
I learned three things.
Lesson #1
Reviewing someone’s code is really scary!
Lesson #2
Getting your code reviewed is really scary!
Lesson #3
Extremely rewarding!
[P]eer code reviews are the single biggest thing you can do to improve your code. If you’re not doing code reviews right now with another developer, you’re missing a lot of bugs in your code and cheating yourself out of some key professional development opportunities. As far as I’m concerned, my code isn’t done until I’ve gone over it with a fellow developer.
— Jeff Atwood
— http://codinghorror.com
Common Excuses
– We pay our developers a lot of money. They should be writing perfect code. The need to review code means it’s not perfect. Imperfect code means we’re not getting our money’s worth.
– We don’t have time.
– It’s not my feature!
“the average [defect detection] effectiveness [rate] of design and code inspections are 55 and 60 percent.”
— Steve McConnell (Book: “Code Complete”)
Compared with 25-45% for various forms of testing.
(via Jeff Atwood http://blog.codinghorror.com/code-reviews-just-do-it/)
Development projects with code review have significantly fewer bugs, reduced development costs, increased productivity, and early ship dates.
(via Jeff Atwood http://blog.codinghorror.com/code-reviews-just-do-it/)
Warning: code review can be damaging if not done carefully.
Effective Code Review is a part of the development process and the culture of the organization, not just an afterthought.
Effective Code Review needs buy-in from everyone involved.
We always get push-back from developers working on the VIP platform.
Effective Code Review sets goals
Goals: Take Your Pick
– Learning other perspectives
– Becoming a better developer
– Becoming a better communicator
Goals: Take Your Pick
– Better Code Quality!
– Have at least one other person look at the code.
– Have fun!
Have established rules, standards, and process
Even if it’s no rules, standards, and processes.
Should define: What is the scope of the review? What types of things are we looking for? How will the process work? And communicate that with the team.
What does a code review entail?
Looking over code to find potential issues or areas for improvement.
Bugs | Best Practices | Design Patterns | Faulty Logic | Confusing Flows | Opportunities for code reuse | Lack of standards/style | etc.
Culture, environment and workflow will change how you adopt it.
Individuals: DIY
– Do what (some) professional writers do!
– Write/Code in the morning => lunch => Edit/Review in the afternoon
– Freewriting?
Individuals: Find a buddy
– Open Source? Find volunteers at meetups/local community/IRC/Twitter.
– Closed source? Find another freelancer and trade.
Individuals: Offer reviews
Review open pull requests on GitHub (send messages to the PR author if you don’t want to ask or discuss publicly).
(May want to be familiar with topic or code in some form before jumping in otherwise start privately.)
Individuals: Pay someone
Airpair / Helpouts / etc.
Teams: Who?
– Gatekeeper
– Peer Review
– Committee Review
– Pair Programming
– By-request
Junior vs Senior developers?
Teams: When?
– Pre Code Review
– Iterative Review
– Pre Commit Review (for centralized source control like SVN)
– Post Commit Review
– Post Deploy Review
Teams: tools?
– Specific code review tools: Crucible, Phabricator.
– Source Control: GitHub, Trac Code Comments, patches.
– In-person.
– Ad hoc: Email/IM/IRC/Snapchat?
– Review as Pull Request / bugfix (https://github.com/Automattic/site-logo/pull/14)
Tools: Don’t need to get fancy.
Reviewing a simple patch/diff can go a long way.
Code Review at Automattic
Varies by team:
– no code review
– team lead or peer review
– gitflow (someone else has to review and merge)
– Code Review P2
Code Reviews are “egoless”
It’s not a competition about how many mistakes you can find. Or who is a better developer.
Nope.
I’ve been programming for n years. I don’t need someone else to tell me what’s wrong with my code.
I’ve been programming for n years. I can point out all the things wrong with someone’s code without ever looking at it.
Don’t make it personal
Reviewers: The fault doesn’t lie in the developer; it lies in the code.
Reviewees: don’t get defensive about feedback you receive about the code. Fix your mistakes and learn from them.
Remember the goals of the code review: learning and making the code better.
Caveat: If the mistakes are continually repeated in future code, then it’s a problem.
Reviewees: you can only fool your reviewer once. Accept your mistakes, but work to prevent them in the future.
Communication is everything!
How you convey your feedback and discuss the code will help ensure an egoless and effective review
Communication Tip #1
No personal pronouns in feedback.
Bad
path-to-file.php@8237#L786
Why are you not sanitizing your GET value here? You should go through your code again and properly sanitize all instances where you’re interacting with remote data.
Translation
path-to-file.php@8237#L786
You! Yes, You! I’m talking to you! You suck!
Good
path-to-file.php@8237#L786
We should sanitize the GET value here per best practices. Any reason not to? There may be other instances where we should take a closer look as well.
Bonus Protip
Wherever you’re inclined to use a pronoun, replace it with “the code” or a coding concept (variable/class/etc.)
“You should be…” => “The code should be…”
Communication Tip #2
Avoid the use of “but”; prefer “and” or no conjunction instead to maintain an air of positivity.
The overall logic is great but the function could use some refactoring.
vs.
The overall logic is great. The function could use some refactoring, though.
Don’t Make Assumptions
Reviewees: optimize for code review; write your best code because you know someone will critique it.
Reviewers: Be critical; ask questions about decisions that were made.
Safe space to ask questions
– Reviewers are okay to ask about code they do not understand. Opinions are okay as well, although they are better phrased as questions to open a discussion.
– Reviewees can ask questions ahead of time about things they are unsure of. This can provide a good starting point for the review.
Avoid Pedantics
Don’t get too caught up in minor details but make sure development best practices and coding standards are being followed.
Automate code style/standards issues, if possible.
Avoid too much emphasis on small patches vs big ones.
Ask a programmer to review 10 lines of code, he'll find 10 issues. Ask him to do 500 lines and he'll say it looks good.
— Giray Özil (@girayozil) February 27, 2013
Not a Silver Bullet
Code review is not going to catch everything. Bugs are inevitable.
Think Positive!
Reviewers: Remember to praise good code and incremental improvements.
Reviewees: Remember to say thank you!
Have fun!
Jokes + GIFs + Emoticons
In Closing
Takes a while to learn the rules. You will forget to follow them. You will fall out of the code review habit. You will hate yourself and your coworkers and peers. But you will become better.
And it will pay off.
Want a cool job?
Automattic is hiring code reviewers, developers (PHP/javascript/node/go/whatever), Happiness Engineers, and more!
Work from anywhere! Unlimited vacations! Cool swag!
Talk to me or http://automattic.com/jobs/
Thanks!
Presentation: WordPress in the Newsroom
WordPress in the Newsroom
About me: Mohammad (Mo) Jangda
- Code Wrangler at WordPress.com (Automattic)
- Core Contributor to WordPress
- Background in News, Publishing, Tech
- Ice Cream Fan
~22.5%
Percentage of top 10 million sites on the web that use WordPress
(up from ~18% last year)
(~29% of all new websites)
Some of the biggest names in media, business, and government
New York Post
TIME.com
Quartz
Re/code
It’s hip to be WordPress
But do you really know and use WordPress to its potential?
Most Important: Stay Secure
– Keep your WordPress updated
– Use a strong password (or a Password Manager)
– Use two-factor authentication
– Make sure your team does the same!
Structured Data
WordPress is a semantic publishing platform.
Custom Post Types
A custom post type is nothing more than a regular post with a different post_type value in the database. The post type of regular posts is post, pages use page, attachments use attachment and so on. You can now create your own to indicate the type of content created. You could create custom post types for books, movies, reviews, products, and so on.
— Smashing Magazine (http://wp.smashingmagazine.com)
Custom Taxonomies
WordPress’ custom taxonomies make it possible to structure large amounts of content in a logical, well-organized way. In WordPress, categories are set up as a hierarchal taxonomy, and tags are set up as a multifaceted taxonomy […] A large news organization could organize its content by world region (Africa, Asia, Europe, Latin America, Middle East, US & Canada), as the BBC does in its “World” section.
— Smashing Magazine (http://wp.smashingmagazine.com)
Post Metadata
WordPress lets you add arbitrary metadata to posts. Movie Ratings. Pull quotes. Geolocation.
Media: Easy Embeds
HULK NOW A JOURNALIST FOR FREEDOM! HULK SMASH LIES!
— Journalist Hulk (@iiefhwpi23hdkwd) January 10, 2011
Go Mobile
Make the mobile app a pre-requisite for all your editors and reporters.
Mobile + Real-time
Liveblogging from the field
Real-time + Curation
LivePress, ScribbleLive / CoverItLive
Curation: Zoninator
Drag-and-drop-based control
Getting Social
Comments, Sharing, Publishing
Web-first Workflows
WordPress is your central hub and content store.
Plugin: Edit Flow
Edit Flow gives you custom statuses, a calendar, editorial comments, and more, all to make it much easier for your team to collaborate within WordPress.
Edit Flow: Calendar
A convenient month-by-month look at your content
Edit Flow: Custom Statuses
Define the key stages to your workflow.
Edit Flow: Editorial Comments
Threaded commenting in the admin for private discussion between writers and editors
Edit Flow: Everything Else
- Notifications – Receive timely updates on the content you’re following.
- User Groups – Keep your users organized by department or function.
- Editorial Metadata – Keep track of the important details.
- Story Budget – View all of your upcoming posts in a more traditional story budget view, and hit the print button to take it to your planning meeting.
Knowledgebase: WP-Help
Institutional Knowledge. Newsroom Continuity.
Co-Authors Plus
Multiple bylines.
Co-Authors Plus
“Guest” Authors.
Other Options?
- Peter’s Collaboration Plugins (Notes + Emails)
- Editorial Calendar
- Email Post Changes
- Post Forking
Your WordPress, Your Way
Tailor everything to your community and newsroom’s needs and culture; there is “no one size fits all” approach.
Presentation: The Database Schema
The Database Schema
About me: Mohammad (Mo) Jangda
mo@automattic.com | batmoo@gmail.com | @mjangda
– Toronto, ON
– Code Wrangler at Automattic / WordPress.com
– Ice Cream Fan
Outline
– API vs Database
– Core Structure & Table Walkthrough
– Points of Interest
– Pitfalls & Opportunities
Caveats
– Not really touching much on Multisite
– High-level concepts
– Ignoring links
– Assuming MySQL only
– No database tuning advice :)
API vs. Database
A Very Rich API
As devs, we don’t have to worry about schema because WordPress abstracts the database interaction for us.
The closest thing most devs need to worry about is the username and password during the install process.
Why bother?
– Because it’s interesting :)
– Important to understand the underlying architecture of the system
– Easier to debug problems and understand changes and new features
– Can help solve interesting problems
Core Structure
Core Structure
– Single-site: 11 base tables
– Multisite: 17 base tables
=== 9 additional tables for every new blog
=== 500 million tables on WordPress.com
Core Structure
– Hybrid entity/object-oriented and key-value store (posts + meta)
– Some normalization of the schema (terms)
– Unique IDs (primary key) for each entity within a table
wp_{object}s tables
Main tables are modelled after the object they are storing:
– wp_posts
– wp_comments
– wp_users
wp_{object}meta tables
Each object type has a key-value meta store:
– wp_postmeta
– wp_usermeta
– wp_commentmeta
Consistency across these tables allows for a common metadata API
key-value store
A bit like an associative array, in table form. Values relating to objects are identified by a key.
(“NoSQL”)
key-value store
– post_id: 123
– key: _thumbnail_id
– value: 456
key can be any valid varchar(255); value can be any primitive or serializable object.
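To make the "associative array in table form" idea concrete, here is a minimal, self-contained sketch that models a meta table as a list of rows. It does not talk to a database or to WordPress; the `ToyMetaStore` class and its method names are made up for illustration, and the serialization step is only an assumption about how a non-scalar value could be squeezed into a text column. On a real site the core metadata functions do this work.

```php
<?php
// Toy model of a {object}meta table: each row is (object_id, meta_key, meta_value).
// Hypothetical illustration only; this is not WordPress code.
final class ToyMetaStore
{
    /** @var array<int, array{object_id:int, meta_key:string, meta_value:string}> */
    private array $rows = [];

    // Insert or overwrite the value stored for (object_id, key).
    public function update(int $objectId, string $key, $value): void
    {
        // Non-scalar values are serialized so they fit in a text column.
        $stored = is_scalar($value) ? (string) $value : serialize($value);
        foreach ($this->rows as $i => $row) {
            if ($row['object_id'] === $objectId && $row['meta_key'] === $key) {
                $this->rows[$i]['meta_value'] = $stored;
                return;
            }
        }
        $this->rows[] = [
            'object_id'  => $objectId,
            'meta_key'   => $key,
            'meta_value' => $stored,
        ];
    }

    // Return the stored string, or null when no row matches.
    public function get(int $objectId, string $key): ?string
    {
        foreach ($this->rows as $row) {
            if ($row['object_id'] === $objectId && $row['meta_key'] === $key) {
                return $row['meta_value'];
            }
        }
        return null;
    }
}

// The example row from the slide: post 123 has thumbnail 456.
$meta = new ToyMetaStore();
$meta->update(123, '_thumbnail_id', 456);
```

Because every meta table shares the same three-column shape, one class like this could stand in for post, user, or comment meta alike, which is the point the slide makes about a common metadata API.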
wp_term(s|_taxonomy|_relationships)
– Terms are handled very differently via three tables
– Somewhat messy and the source of many frustrations
Table: options
Similar key-value store to meta but only one object (blog) and no matching table for that object
Table: options
option_id [bigint(20)]
option_name [varchar(64)]
option_value [longtext]
autoload [varchar(20)]
SELECT * FROM wp_options WHERE autoload = 'yes'
// get_option( 'cookies' )
SELECT * FROM wp_options WHERE option_name = 'cookies'
Table: commentmeta
Look familiar? :)
http://codex.wordpress.org/Database_Description#Table:_wp_commentmeta
Terms
– Mapping looks something like:
wp_terms
^v
wp_term_taxonomy
^v
wp_term_relationships
^v
wp_(posts|links)
Table: wp_term_taxonomy
http://codex.wordpress.org/Database_Description#Table:_wp_term_taxonomy
Table: wp_term_relationships
http://codex.wordpress.org/Database_Description#Table:_wp_term_relationships
#1: Get the term_id from slug
SELECT wp_43654959_term_taxonomy.term_id
FROM wp_43654959_term_taxonomy
INNER JOIN wp_43654959_terms USING (term_id)
WHERE taxonomy = 'category'
AND wp_43654959_terms.slug IN ('stuff')
require, wp, WP->main, WP->query_posts, WP_Query->query, WP_Query->get_posts, WP_Tax_Query->get_sql, WP_Tax_Query->clean_query, WP_Tax_Query->transform_query, wpdb->get_col
#2: Get the term_tax_id
SELECT term_taxonomy_id
FROM wp_43654959_term_taxonomy
WHERE taxonomy = 'category'
AND term_id IN (293)
require, wp, WP->main, WP->query_posts, WP_Query->query, WP_Query->get_posts, WP_Tax_Query->get_sql, WP_Tax_Query->clean_query, WP_Tax_Query->transform_query, wpdb->get_col
#3: Get the term object (optional)
SELECT t.*, tt.*
FROM wp_43654959_terms AS t
INNER JOIN wp_43654959_term_taxonomy AS tt ON t.term_id = tt.term_id
WHERE tt.taxonomy = 'category'
AND t.slug = 'stuff'
LIMIT 1
require, wp, WP->main, WP->query_posts, WP_Query->query, WP_Query->get_posts, get_term_by, wpdb->get_row
#4: Get the posts
SELECT SQL_CALC_FOUND_ROWS wp_43654959_posts.ID
FROM wp_43654959_posts
INNER JOIN wp_43654959_term_relationships ON (wp_43654959_posts.ID = wp_43654959_term_relationships.object_id)
WHERE 1=1
AND ( wp_43654959_term_relationships.term_taxonomy_id IN (1) )
AND wp_43654959_posts.post_type = 'post'
AND (wp_43654959_posts.post_status = 'publish' OR wp_43654959_posts.post_status = 'private')
GROUP BY wp_43654959_posts.ID
ORDER BY wp_43654959_posts.post_date DESC
LIMIT 0, 10;
SELECT FOUND_ROWS()
require, wp, WP->main, WP->query_posts, WP_Query->query, WP_Query->get_posts, wpdb->get_col
Term Frustration: Complicated Queries
Queries usually require one or more JOINs
– get the term_taxonomy_id of the term matching our slug and taxonomy
– get posts which match the term_taxonomy_id via the term_relationships table
Things get even messier when you want to query by multiple terms or taxonomies or a NOT IN
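The three-table hop described above can be sketched without a database. The arrays below are toy stand-ins for `wp_terms`, `wp_term_taxonomy`, and `wp_term_relationships`, the rows are invented sample data, and `toy_object_ids_for_term()` is a hypothetical helper, not a WordPress function. It only illustrates why a slug-based lookup has to pass through two intermediate IDs before it reaches any posts.

```php
<?php
// Toy rows standing in for the three term tables (hypothetical data).
$terms = [
    ['term_id' => 293, 'name' => 'Stuff', 'slug' => 'stuff'],
];
$termTaxonomy = [
    ['term_taxonomy_id' => 1, 'term_id' => 293, 'taxonomy' => 'category'],
];
$termRelationships = [
    ['object_id' => 10, 'term_taxonomy_id' => 1],
    ['object_id' => 11, 'term_taxonomy_id' => 1],
    ['object_id' => 12, 'term_taxonomy_id' => 7],
];

// Resolve slug + taxonomy to object IDs by walking
// terms -> term_taxonomy -> term_relationships.
function toy_object_ids_for_term(
    array $terms,
    array $termTaxonomy,
    array $termRelationships,
    string $slug,
    string $taxonomy
): array {
    // Step 1: slug -> term_id.
    $termIds = [];
    foreach ($terms as $term) {
        if ($term['slug'] === $slug) {
            $termIds[] = $term['term_id'];
        }
    }

    // Step 2: term_id + taxonomy -> term_taxonomy_id.
    $termTaxonomyIds = [];
    foreach ($termTaxonomy as $row) {
        if ($row['taxonomy'] === $taxonomy && in_array($row['term_id'], $termIds, true)) {
            $termTaxonomyIds[] = $row['term_taxonomy_id'];
        }
    }

    // Step 3: term_taxonomy_id -> object_id (the posts or links).
    $objectIds = [];
    foreach ($termRelationships as $row) {
        if (in_array($row['term_taxonomy_id'], $termTaxonomyIds, true)) {
            $objectIds[] = $row['object_id'];
        }
    }
    return $objectIds;
}

$postIds = toy_object_ids_for_term($terms, $termTaxonomy, $termRelationships, 'stuff', 'category');
```

Each loop here corresponds to one JOIN (or one extra query) in SQL, which is where the cost comes from once several terms or taxonomies are combined.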
Term Frustrations: #5809-core
Terms with the same slug are bound together: a title or description change to one impacts the other. The alternative is being forced to use the dreaded `-2` in your slug.
https://core.trac.wordpress.org/ticket/5809
(Fixed early November 2014!)
Term Frustrations: No meta!
No key-value/meta store for terms!
Have to rely on hacky workarounds (store encoded/serialized in the description) or plugins (“Taxonomy Meta” or “Meta for Taxonomies”)
Good News!
There are some plans under way to simplify things:
– from 3 tables to 2
– reduce the complexity of queries with back-compat
– pave the way for taxonomy meta and better object modelling
https://make.wordpress.org/core/2014/11/12/an-update-on-the-taxonomy-roadmap/
Points of Interest
API functions for almost anything you want to do
Most API functions have actions and filters to help modify the resulting queries (e.g. posts_clauses)
add_filter( 'posts_where', function( $where ) {
    // kill the query! (the clause is appended after WHERE 1=1, so it needs a leading AND)
    $where = ' AND 1 = 0';
    // modify as needed!
    return $where;
} );
$wpdb class for interacting with the database directly
global $wpdb;
$wpdb->insert( ... );
$wpdb->delete( ... );
$wpdb->get_results( ... );
$wpdb->get_var( ... );
Very consistent naming (object + meta pattern)
(`{object}_id`, `meta_key`, `meta_value`)
* a few inconsistencies (e.g. ID, umeta_id) :)
Dates are stored in local and GMT versions:
post_date and post_date_gmt
post_modified and post_modified_gmt
etc.
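As a rough illustration of the local/GMT pairing, the sketch below builds both values for one moment in time using plain PHP date classes. The function name `toy_post_date_pair()` and the chosen timezone are assumptions for the example; WordPress itself derives the local value from the site's timezone setting.

```php
<?php
// Build the local/GMT pair for one moment in time, formatted as
// 'Y-m-d H:i:s' strings like the values stored in the date columns.
// Hypothetical helper for illustration; not a WordPress function.
function toy_post_date_pair(string $localTime, string $timezone): array
{
    $local = new DateTimeImmutable($localTime, new DateTimeZone($timezone));
    $gmt   = $local->setTimezone(new DateTimeZone('UTC'));

    return [
        'post_date'     => $local->format('Y-m-d H:i:s'),
        'post_date_gmt' => $gmt->format('Y-m-d H:i:s'),
    ];
}

// Toronto is five hours behind UTC in mid-November.
$pair = toy_post_date_pair('2014-11-12 09:30:00', 'America/Toronto');
```

Storing both values means a query can sort or compare on the GMT column regardless of the site's timezone, while templates can print the local column without converting.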
Object data columns are usually prefixed with the type
(`post_title` instead of just `title`, `comment_date` instead of just `date`)
* a few exceptions (e.g. ID)
Not all fields are still used:
`comment_karma` in comments
`to_ping` in posts
`term_group` in terms
(`options` used to have a `blog_id` column until 3.4)
Some fields are odd
(for comment_type default value is “”, which is a comment :))
Database changes don’t happen very often in core (can go several releases without a change).
- $wp_db_version = 27916;
+ $wp_db_version = 29188;
Difficult to make schema changes without risk of breaking things
– Especially hard on larger multisite installs
– pre_schema_upgrade attempts to handle the upgrade
https://core.trac.wordpress.org/browser/trunk/src/wp-admin/includes/upgrade.php#L2067
Custom database tables through plugins and themes are discouraged
– The schema was designed to be extremely flexible
– Not always perfect (e.g. post2post) but can accommodate a huge number of use cases
– But, tools available if your use case requires them (e.g. dbDelta)
Pitfalls & Opportunities
Pitfalls: Slow/Complex queries
– WP_Query is very powerful; you can do some insane lookups
– At scale, these insane lookups can break your site
– If you’re not careful, these insane lookups can break the site even with very little traffic
Pitfalls: Slow/Complex queries
– Non-indexed or expensive queries can be a problem
– The structure of the taxonomy system lends itself to slow queries
– meta_key- or meta_value-based queries on really large tables are slow
Pitfalls: Schemaless Meta
– Some might argue that the flexible schema leads to bad application/site design
– Others argue that it leads to less thought being put into how meta will be used
Opportunity: Consistency
– Know what exactly to expect between installs
– Your WordPress is the same as my WordPress
Opportunity: Flexibility
– Custom post types!
– Any manner of object can be stored as custom “posts”
– Keyed off the “post_type” column
Opportunity: Flexibility
– wp_term_relationships does not have a strict definition of what an object_id is
– Used for both posts and links!
– Could technically use for users as well
Opportunity: alt use cases
– Using taxonomy as a post-to-post connector (instead of meta)
– https://github.com/mjangda/taxonomy-to-post_type-sync
Opportunity: alt use cases
– Machine-generated objects as post_types
=== Redirects via WP.com Legacy Redirector
=== DNS records on WordPress.com
=== Sitemaps in Comprehensive Sitemaps
$args = array(
    'post_name'   => $from_url_hash,
    'post_title'  => $from_url,
    'post_type'   => self::POST_TYPE,
    'post_parent' => $redirect_to,
);
wp_insert_post( $args );
Opportunity: alt use cases
– Liveblog entries as comments
=== Completely custom UI for interacting with data
=== All abstracted into its own API (and using the WordPress API) to map data to database fields (WPCOM_Liveblog_Entry, WPCOM_Liveblog_Entry_Query)
Summary
– The schema provides a good object model + key-value store for data
– It is extremely flexible and powerful
– Not something you should really have to worry about or deal with
– It can present problems if we are not careful
– It can reward us very well if we use it as it was intended to be
Hiring, Hiring, Hiring!
– VIP Wranglers!
– Code Wranglers!
– Designers!
– Happiness Engineers!
– Interns!
Thank You! Questions?
mo@automattic.com | batmoo@gmail.com | @mjangda
Presentation: Deploys at WordPress.com VIP
Deploys at WordPress.com VIP
About me: Mohammad (Mo) Jangda
mo@automattic.com | batmoo@gmail.com | @mjangda
- Code Wrangler at Automattic
- Ice Cream Fan
What is WordPress.com?
Hosted platform for running blogs, sites, etc. using WordPress.
Big multisite network.
What is VIP?
WordPress hosting and services for big companies (CNN, TIME, ESPN, CBS, TechCrunch, Williams, etc.)
A VIP site on WordPress.com is the same as a free site, except with some custom code and a bit of magic sauce.
The VIP team works with external developers providing code review, developer support, etc.
Fancy Numbers
54 million sites
62 million users
30 million posts (per month)
13 billion pageviews / 400 uniques (per month)
500 million MySQL tables
2500 servers
3+ DCs
How Many Deploys?
Yesterday: 259
VIP: 165
WP.com: 75
Other: 19
How is code deployed on WordPress.com?
Production servers run trunk
– Commit changes to trunk
– Run deploy script
– Deploy script syncs svn mirrors across DCs and runs `svn up` on all servers

$ deploy wpcom
Going to update from 93786 to 93787 for /public_html/
Syncing wpcom SVN Mirrors
DFW (1s)...
IAD (2s)...
SAT (2s)...
Deploying wpcom revision 93787
Deploying to static webs
SAT (1s)...
IAD (1s)...
DFW (1s)...
Deploying to dynamic webs
DFW (5s)...
IAD (7s)...
SAT (9s)...
What about VIPs?
Similar to WordPress.com except:
– Code comes from external developers
– Deploy done internally
– We only push the particular folder (code changes are limited to the “theme”)
VIP Numbers
5 Software Engineers; 1 Happiness Engineer; other biz people
7m lines of code (on top of WP.com codebase)
1000s of sites
~100 active developers
Over 140K commits (currently r140321)
70K deploys all time
Avg deploy time (commit => review => deploy): 130 minutes
Challenges
– External developers writing PHP, HTML, and JS so anything is fair game
– Issues on one site can spill over to another (also helps with scaling)
Code Review is Essential
Make sure they’re not doing not-so-good things
Guidelines: http://vip.wordpress.com/documentation/code-review-what-we-look-for/
Challenge: Code Review is hard and time-consuming! Blocks business needs!
Deploy Page
– custom built tool to optimize review process
– Used to track pending commits, review changes, and deploy
– Can send feedback on issues, alert rest of team when issues spotted
– Handy revert commands!
– “real-time”
Pre-Deploy Tests
Run when deploy button is pressed.
Loads up the site against a sandbox server in production with live database (crazy!) and latest code.
Static Analysis
Using custom scanning tools + PHP Code Sniffer to catch things like restricted functions, bad coding patterns/practices, etc.
Coming soon: post-commit tests
Using Jenkins to run tests immediately after commit, instead of on-demand using Sandbox server in production.
Other Tools
– Reference Page with handy revert commands
– IRC Channel piping PHP and MySQL errors from 2 servers
– IRC alerts for internal commit and deploy notifications
– Email and webhooks for external commit and deploy notifications
Future: Scaling
– More developers and automation
– Externally initiated deploys and post-deploy reviews
Hiring, Hiring, Hiring!
- VIP Wranglers, VIP Wranglers, VIP Wranglers!
- Code Wranglers, Code Wranglers, Code Wranglers!
- Designers, Designers, Designers!
- Happiness Engineers, Happiness Engineers, Happiness Engineers!
WordPress 3.6
3.6 is finally out! Glad that I could contribute to a WordPress release again!
Presentation: Caching; for fun and profit
Caching; For Fun & Profit
Understanding different caching tools and techniques available to WordPress developers such as the Transient and Object Caching APIs and how/why they can make or break your site.
About me: Mohammad (Mo) Jangda
mo@automattic.com | batmoo@gmail.com | @mjangda
- Toronto, ON
- Code Wrangler at WordPress.com VIP (Automattic)
- Core Contributor
- Ice Cream Fan
Hiring, Hiring, Hiring!
- VIP Wranglers, VIP Wranglers, VIP Wranglers!
- Code Wranglers, Code Wranglers, Code Wranglers!
- Designers, Designers, Designers!
- Happiness Engineers, Happiness Engineers, Happiness Engineers!
What is caching?
Caching is more than just installing a plugin to fix your uptime problems.
It’s a way to temporarily store data so that it can be reused.
What is caching?
It’s meant to avoid doing the same expensive computations (fetching from the database, making remote calls, etc.) over and over again.
Put something in a nearby place so you can access it more readily.
Can cache all types of things: objects, arrays, strings, integers, etc.
Analogy: A Fireplace, A Lumberjack, and A Pile o’ Wood
So What?
According to Google:
High performance web sites lead to higher visitor engagement, retention and conversions
Translation:
Fast sites == $$$
What is a slow pageload?
A profit killer.
What is a slow pageload?
Inevitable (as your traffic and site grows).
What is a slow pageload?
An insult to your users. Your users appreciate/want/need/deserve a fast site.
Goal: This, or better
The Super Secret Technique To A Really Fast Site
The Fallback: Caching
Caching: More than just for speed
Caching will help prevent your site from going down if you rely on external services. If Twitter goes down, your site can too.
Caching: More than just for speed
Will force you to think about and write better code. And become a better developer.
The Cache Loop
Most caching interactions follow a simple loop-based pattern:
- Get the value we want from cache!
- Did we get it?
- If “no”: get or generate the value and cache it!
- Use the value!
Other patterns exist, but this is the simplest, most common one.
The Cache Loop
value = get from cache
if ( value not found ) {
    generate value
    save value in cache
}
do stuff with value
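The loop above can be tried out in plain PHP with an array standing in for the cache. Everything here is a hypothetical sketch: `toy_get_value()` and `toy_expensive_value()` are invented names, and the counter exists only to show that the expensive path runs once.

```php
<?php
// A tiny array-backed cache, standing in for whichever cache backend is in use.
$cache = [];
$generated = 0; // counts how often the expensive path runs

// Pretend this is a slow database query or remote call.
function toy_expensive_value(int &$generated): string
{
    $generated++;
    return 'expensive result';
}

function toy_get_value(array &$cache, int &$generated): string
{
    // 1. Try the cache first.
    if (!array_key_exists('my-key', $cache)) {
        // 2. Miss: generate the value and store it for next time.
        $cache['my-key'] = toy_expensive_value($generated);
    }
    // 3. Use the value.
    return $cache['my-key'];
}

$first  = toy_get_value($cache, $generated); // miss: generates and stores
$second = toy_get_value($cache, $generated); // hit: served from the array
```

The second call never reaches `toy_expensive_value()`, which is the whole benefit of the pattern.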
Different types of caching
- Variable/static caching
- Object caching
- Fragment caching (pieces of rendered HTML or output for a page)
- Opcode caching (PHP Acceleration)
- Full-page caching (the whole rendered page)
- Asset Caching (CDN)
Full-page caching
Your first and easiest line of defence. Store the entire generated page in cache and serve to users when they visit.
* Super Cache
* W3 Total Cache
* Batcache (used on WP.com; pages served in 0.004 seconds)
Harder now that we have more personalized, user-centric sites.
The quality of your code will impact how effective a full-page cache is
Code Quality is important
Caching/optimizing at the application-level is super important.
Will touch on this later.
Opcode Caching
APC, XCache, Zend Optimizer+, etc.
Improves the actual PHP-level load times of your application/site by storing compiled versions of your PHP files.
Can significantly improve load times if your site has thousands of files (or using large plugins like BuddyPress).
Variable Assignments
Is this better?
if ( get_the_terms() ) {
    foreach ( get_the_terms() as $term ) {
        …
    }
}
Or this?
$terms = get_the_terms();
if ( $terms ) {
    foreach ( $terms as $term ) {
        …
    }
}
Variable Assignments
Variable assignments are a type of caching.
You store a value in memory so it can be re-used.
Caching with the static var
Useful if you repeat the same function over and over again but don’t want the value to persist between pageloads.
function x_is_it_true() {
    static $is_it_true;
    if ( ! isset( $is_it_true ) ) {
        $is_it_true = call_really_expensive_function();
    }
    return $is_it_true;
}
Little did you know…
WordPress has caching built-in!
WordPress has caching built-in!
A built-in, non-persistent caching API.
On any given pageload, you’re going to need to fetch a single option potentially tens of times. And WordPress tries to optimize for that.
Note: it is non-persistent!*
Here’s how WordPress caches options:
- Fetch all autoload = yes options: SELECT option_name, option_value FROM wptrunk_options WHERE autoload = 'yes'
- Add the results to object cache (which is really just a global array)
- Next time an option is needed, look in the cache first
- If not found, grab from the database and update the array.
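The four steps above can be modelled in a few lines of standalone PHP. This is a toy under stated assumptions: the "database" is just an array, `ToyOptions` and its members are hypothetical names, and the query counter exists only to show when a lookup would have reached the database.

```php
<?php
// Toy model of the options lookup described above. The "database" is an array
// and every name here is hypothetical; this is not WordPress's implementation.
final class ToyOptions
{
    public int $queries = 0;      // how many times the "database" was hit
    private ?array $cache = null; // per-request cache, empty at the start of a pageload
    private array $table;

    public function __construct(array $table)
    {
        $this->table = $table;
    }

    public function get(string $name)
    {
        if ($this->cache === null) {
            // First call: one query loads every autoload = 'yes' option.
            $this->queries++;
            $this->cache = [];
            foreach ($this->table as $row) {
                if ($row['autoload'] === 'yes') {
                    $this->cache[$row['option_name']] = $row['option_value'];
                }
            }
        }

        if (!array_key_exists($name, $this->cache)) {
            // Not autoloaded: fetch this one option and remember the result.
            $this->queries++;
            $this->cache[$name] = false;
            foreach ($this->table as $row) {
                if ($row['option_name'] === $name) {
                    $this->cache[$name] = $row['option_value'];
                }
            }
        }

        return $this->cache[$name];
    }
}

$options = new ToyOptions([
    ['option_name' => 'blogname', 'option_value' => 'My Site', 'autoload' => 'yes'],
    ['option_name' => 'cookies', 'option_value' => 'chocolate chip', 'autoload' => 'no'],
]);
```

Reading `blogname` costs one query for the whole autoload set; reading `cookies` costs one more; every later read of either option is answered from the array.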
Code Snippet from get_option()
$alloptions = wp_load_alloptions(); // wp_load_alloptions fetches autoload = yes
if ( isset( $alloptions[$option] ) ) {
    $value = $alloptions[$option];
} else {
    ...
    $row = $wpdb->get_row( $wpdb->prepare( "SELECT option_value FROM $wpdb->options WHERE option_name = %s LIMIT 1", $option ) );
    ...
}
WordPress caches lots of things for you!
* options
* posts
* users
* queries
* terms
* etc.
Object Caching API
This is the API WordPress uses and makes available to developers:
/**
 * Retrieves the cache contents from the cache by key and group.
 *
 * @param int|string $key   What the contents in the cache are called
 * @param string     $group Where the cache contents are grouped
 *
 * @return bool|mixed False on failure to retrieve contents or the cache
 *                    contents on success
 */
function wp_cache_get( $key, $group = '' ) {
Object Caching API
/**
 * Saves the data to the cache.
 *
 * @param int|string $key    What to call the contents in the cache
 * @param mixed      $data   The contents to store in the cache
 * @param string     $group  Where to group the cache contents
 * @param int        $expire When to expire the cache contents
 *
 * @return bool False on failure, true on success
 */
function wp_cache_set( $key, $data, $group = '', $expire = 0 ) {
Object Caching API
/**
 * Adds data to the cache, if the cache key doesn't already exist.
 *
 * @param int|string $key    The cache key to use for retrieval later
 * @param mixed      $data   The data to add to the cache store
 * @param string     $group  The group to add the cache to
 * @param int        $expire When the cache data should be expired
 *
 * @return bool False if cache key and group already exist, true on success
 */
function wp_cache_add( $key, $data, $group = '', $expire = 0 ) {
Object Caching API
/**
 * Removes the cache contents matching key and group.
 *
 * @param int|string $key   What the contents in the cache are called
 * @param string     $group Where the cache contents are grouped
 * @return bool True on successful removal, false on failure
 */
function wp_cache_delete( $key, $group = '' ) {
Difference between wp_cache_add and wp_cache_set?
If the value already exists, add will bail; set will override.
A few other intricacies as well, although, don’t worry about them unless you’re building really big sites.
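The bail-versus-override behaviour is easy to see with an array-backed stand-in. `toy_cache_add()` and `toy_cache_set()` below are hypothetical names that only mimic the semantics described above; they are not the WordPress functions and ignore groups and expiry.

```php
<?php
// Array-backed stand-ins for the add/set semantics described above.
// Hypothetical helpers; groups and expiry are left out on purpose.
function toy_cache_set(array &$cache, string $key, $data): bool
{
    $cache[$key] = $data; // always writes, replacing any existing value
    return true;
}

function toy_cache_add(array &$cache, string $key, $data): bool
{
    if (array_key_exists($key, $cache)) {
        return false; // key exists: bail without touching the stored value
    }
    $cache[$key] = $data;
    return true;
}

$cache = [];
$addedFirst = toy_cache_add($cache, 'greeting', 'hello');   // key is new, so it is stored
$addedAgain = toy_cache_add($cache, 'greeting', 'goodbye'); // key exists, so add bails
$afterAdd   = $cache['greeting'];                           // still the first value
toy_cache_set($cache, 'greeting', 'goodbye');               // set overrides
```

A practical consequence: `add` is the safer choice when two requests might populate the same key at once and the first writer should win.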
The Cache Loop: In Action!
// First, check to see if the value is in the cache
$featured_posts = wp_cache_get( 'my-featured-posts' );

// Did we find it?
if ( false === $featured_posts ) {
    // Nope! Let's generate and cache it!
    $featured_posts = get_posts( array(
        'post__in' => get_option( 'sticky_posts' )
    ) );
    wp_cache_set( 'my-featured-posts', $featured_posts );
}

// Cool, now we have our value; let's use it!
foreach ( $featured_posts as $post ) {
    ...
Warning: Careful with the false and error conditions!
You should account for those or be prepared for pain during failures.
$tweets = my_get_twitter_tweets( '#yolo' );
// Failures are never cached here, so every pageload retries the request.
if ( $tweets ) {
    wp_cache_set( 'yolo', $tweets, 'my-tweets' );
}
Better Handling Error Conditions
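$tweets = my_get_twitter_tweets( '#yolo' );
if ( $tweets ) {
    // Success: cache the tweets for 10 minutes.
    wp_cache_set( 'yolo', $tweets, 'my-tweets', MINUTE_IN_SECONDS * 10 );
} else {
    // Failure: cache an empty result for 1 minute so we back off before retrying.
    wp_cache_set( 'yolo', array(), 'my-tweets', MINUTE_IN_SECONDS * 1 );
}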
Warning: Prefixing rules still apply
If you run into conflicts, you’re gonna have a bad time!
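One way to apply prefixing to cache keys is to route every key through a small helper. This is only a sketch: the `myplugin` prefix and both function names are hypothetical, and it assumes WordPress is loaded so wp_cache_set() exists.

```php
<?php
/**
 * Builds a cache key that is unlikely to collide with other code.
 * The "myplugin" prefix and this helper are hypothetical names.
 *
 * @param string $name Short, local name for the cached item.
 * @return string Prefixed cache key.
 */
function myplugin_cache_key( $name ) {
    return 'myplugin_' . $name;
}

/**
 * Stores tweets under a prefixed key and a plugin-specific group.
 * Assumes WordPress is loaded so that wp_cache_set() exists.
 *
 * @param array $tweets Tweets to cache.
 * @return bool Whatever wp_cache_set() reports.
 */
function myplugin_cache_tweets( array $tweets ) {
    return wp_cache_set( myplugin_cache_key( 'tweets' ), $tweets, 'myplugin' );
}
```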
Warning: By default, caching in WordPress is non-persistent.
This means that your cache is emptied and re-built every time a new page is loaded.
Persistent Object Caching
You can make the data persist using an object caching backend like memcached:
http://wordpress.org/plugins/memcached/
This will retain objects in cache across pageloads, so you don’t need to hit the database again.
There are other caching backends as well, such as APC.
If you don’t have access to object cache…
Transients
If you’re a plugin/theme author, there’s no guarantee of the environment your users are in and whether persistent object caching is available.
The WordPress Transients API helps you cache things without needing a caching backend.
Transients
The data:
a) persists across pageloads; and
b) expires after a set period of time (hence the name).
Useful for things like remote data or data that you know has a limited time span.
The Transients API
function get_transient( $transient ) {}
function set_transient( $transient, $value, $expiration ) {}
function delete_transient( $transient ) {}
The Cache Loop: In Action!
// First, check to see if the value is in the cache
$remote_data = get_transient( 'my-remote-data' );

// Did we find it?
if ( false === $remote_data ) {
    // Nope! Let's fetch and cache it!
    $response = wp_remote_get( 'http://foo.com/file.txt' );
    $remote_data = json_decode( wp_remote_retrieve_body( $response ) );
    set_transient( 'my-remote-data', $remote_data );
}

// Cool, now we have our value; let's use it!
foreach ( (array) $remote_data as $tweet ) {
    ...
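The loop above sets no expiration and does not check whether the request failed. A slightly more defensive sketch might look like the following. The helper name and the expiry times are assumptions picked for illustration; the transient name and URL are the ones from the slide.

```php
<?php
/**
 * Returns decoded remote data, cached in a transient.
 * Hypothetical helper; assumes WordPress is loaded. The expiry times
 * are illustrative assumptions, not recommendations.
 *
 * @return array Decoded data, or an empty array if the fetch failed.
 */
function myplugin_get_remote_data() {
    $remote_data = get_transient( 'my-remote-data' );
    if ( false !== $remote_data ) {
        return (array) $remote_data; // Cache hit (possibly a cached failure).
    }

    $response = wp_remote_get( 'http://foo.com/file.txt' );
    $data     = null;
    if ( ! is_wp_error( $response ) && 200 === (int) wp_remote_retrieve_response_code( $response ) ) {
        $data = json_decode( wp_remote_retrieve_body( $response ), true );
    }

    if ( is_array( $data ) ) {
        // Good response: keep it for ten minutes.
        set_transient( 'my-remote-data', $data, 10 * MINUTE_IN_SECONDS );
        return $data;
    }

    // Failure: cache an empty array briefly so we do not hammer the remote host.
    set_transient( 'my-remote-data', array(), MINUTE_IN_SECONDS );
    return array();
}
```

Because the failure case stores an empty array rather than false, the next pageload sees a cache hit and skips the remote request until the short expiry passes.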
Warning: Don’t overuse transients
* they’re stored in the options table and can bloat; and
* they require database calls to fetch.
Famous Last Words
“Oh, don’t worry. I’ll just put it in cache/transient…”
I worry. A lot.
Caching helps but…
You need to understand what your application is doing and be smart about what you cache.
A slow query will always be slow regardless of how many layers of caching you add around it.
Cache Stampedes Will Break You
Cache stampede: when a load of requests in quick succession kick off the same expensive process because all of them hit an empty cache at or around the same time.
Too many lumberjacks chopping the same wood
Cache Stampede: Solutions
- Locking: Prevent the pileup
- Async: Separate the cache generation process
…OR FIX (REMOVE) THE DAMN THING!
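A rough sketch of the locking idea, leaning on the fact that wp_cache_add() returns false when the key already exists. Everything named here (the helper, the keys, the `myplugin` group, the timings) is hypothetical, and the lock only works across requests when a persistent object cache is in place.

```php
<?php
/**
 * Returns an expensive value, letting only one request rebuild it at a time.
 * Hypothetical helper; assumes WordPress with a persistent object cache.
 *
 * @param callable $generate Callback that builds the expensive value.
 * @return mixed The cached or fresh value, or false if another request
 *               is rebuilding it and nothing is cached yet.
 */
function myplugin_get_with_lock( $generate ) {
    $value = wp_cache_get( 'expensive-report', 'myplugin' );
    if ( false !== $value ) {
        return $value; // Cache hit: nothing to do.
    }

    // Try to take the lock. Only the first request gets true back.
    // The 30 second expiry frees the lock if this request dies midway.
    if ( ! wp_cache_add( 'expensive-report-lock', 1, 'myplugin', 30 ) ) {
        return false; // Someone else is already rebuilding; do not pile on.
    }

    $value = call_user_func( $generate );
    wp_cache_set( 'expensive-report', $value, 'myplugin', 10 * MINUTE_IN_SECONDS );
    wp_cache_delete( 'expensive-report-lock', 'myplugin' );

    return $value;
}
```

Callers have to cope with the false return, for example by showing a fallback or slightly stale content, which is the price of preventing the pileup.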
The Uncached (“cold cache”) Pageload
Optimize for the worst-case scenario: aim for the best by optimizing for the worst (within realistic means).
Uncached pageloads should be as slim as possible.
The Uncached (“cold cache”) Pageload
Technique: Kill your object cache and examine load time patterns on a few different types of pages:
- homepage
- category
- single
- 404
- any special or complex templates
Examine the Uncached Pageload
What’s the load time like?
- If it times out, that’s a red flag.
- If the pageload time is inconsistent, that’s a red flag.
- If the pageload time is unusually high, that’s a red flag.
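To put numbers on those red flags, one option is to log the load time and query count at the end of every request while you test. A small sketch, assuming WordPress is loaded so timer_stop() and get_num_queries() exist; the helper name and log format are made up.

```php
<?php
/**
 * Logs how long the current pageload took and how many queries it ran.
 * Hypothetical helper; assumes WordPress is loaded so timer_stop() and
 * get_num_queries() exist.
 *
 * @return string The line that was written to the error log.
 */
function myplugin_log_pageload_stats() {
    $uri  = isset( $_SERVER['REQUEST_URI'] ) ? $_SERVER['REQUEST_URI'] : '(unknown)';
    $line = sprintf(
        'pageload %s: %s seconds, %d queries',
        $uri,
        timer_stop( 0, 3 ), // Seconds since WordPress started loading.
        get_num_queries()
    );
    error_log( $line );
    return $line;
}

// Run at the very end of every request, when WordPress is available.
if ( function_exists( 'add_action' ) ) {
    add_action( 'shutdown', 'myplugin_log_pageload_stats' );
}
```

Load each page type a few times with an empty cache and compare the logged lines; wide swings or large numbers point at the templates worth digging into.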
Summary
- Caching in WordPress is “easy”: many things built-in; many ways to do it
- Understanding the intricacies of who, what, when, where to cache is hard!
- There’s a lot more to caching than I’ve covered here!
- If used well, caching will help keep your site alive, fast, and $$$!
Presentation: How to think and work like a VIP Developer
How to think and work like a VIP Developer
About me: Mohammad (Mo) Jangda
mo@automattic.com | batmoo@gmail.com | @mjangda
- Toronto, ON
- Code Wrangler at WordPress.com VIP (Automattic)
- Core Contributor to WordPress
- Ice Cream Fan
Warning: #humblebrag
Please also excuse my terrible attempts at humour.
How Does VIP Work?
Automattic: Hosting & Support; Code Architecture and Review; Troubleshooting; Best Practices; Upgrades; etc.
Client or Partner: Editorial & Content; Development; Maintenance; etc.
There are a few simple formulas when it comes to scalability…
Big site + Lots of traffic
=
:(
Big site + Lots of traffic
+ Good infrastructure
=
:)
Big site + Lots of traffic
+ Good infrastructure
+ VIP Developer(s)
=
:D :D :D :D :D
Why does this Formula work?
Because VIP Developers are some of the best WordPress developers in the world!
Who are these mythical VIP Developers?
In a very literal sense: any developer who works on a VIP site.
VIP Developers work at various agencies
…or they work for the VIP sites
What’s so special about them?
- Their code scales! (Millions of pageviews? Bring it on!)
- Their code is secure (Hackers feel inadequate!)
- Their code is future-proof! (Always trunk-ready!)
- Their code is clean, readable, extensible, modularized, well-documented, etc.
- Their code will make you a sandwich, if you ask nicely enough.
Most Importantly They’re Ruggedly Handsome!
The Key Question:
How does one become a VIP Developer?
(In the figurative sense)
Here’s the secret:
A VIP Developer is great at software development.
Qualities of a Great Software Developer
- Good Technical Aptitude
- Humble
- Open to feedback and improving on past mistakes
- Never stops learning
- Can break down, understand, and articulate problems
- Follows best practices and guidelines (and knows when to break them)
- Can admit when they’re doing something wrong
Notice how I haven’t mentioned WordPress anywhere in the list?
Great Software Developers:
Understand and Follow good software development practices
- You don’t need to be a computer scientist
- Software design processes (e.g. Agile), patterns (e.g. Factories), principles (e.g. DRY)
- Think beyond the requirements
Notice how I still haven’t mentioned WordPress anywhere in the list?
Great Software Developers:
Are platform independent
- Expose yourself to non-WordPress-y things
- WordPress can power the Mars lander if we really want it to but that doesn’t mean we should.
- Explore other languages (beyond PHP)
- Explore libraries and other frameworks (beyond jQuery)
- Why? You can flex that muscle if a project comes up with that one odd requirement
- Or you can bring the thinking of other frameworks into your WordPress development
A VIP Developer’s Rules to Live By
The Most Important One:
Don’t make Mo mad!
(He will take away your commit access!)
Rule: Become a WordPress expert
- You don’t need to know everything
- But you do need to know the key components and how they interact with each other
- Example: what pieces of WordPress interact during any given pageload
- Helps with troubleshooting, debugging, feature development
Rule: Become Intimate with Code
- Spend more time with code than is healthy (have the Trac and SVN URLs bookmarked)
- Know what code lives where (or how to find it)
- Use great resources (e.g. Adam Brown’s Hook and Filter Database)
Rule: Follow core development
…or, better, contribute!
Rule: Never, ever modify WordPress core
Rule: Debug Often
And know what tools to use
define( 'WP_DEBUG', true );
Developer Plugin (http://wordpress.org/extend/plugins/developer/)
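WP_DEBUG has a few companion constants that are handy on a development site. A sketch of what the relevant part of wp-config.php might look like; which ones you enable is a judgment call, and none of this belongs on a production site.

```php
<?php
// Development-only settings for wp-config.php (a sketch; adjust to taste).
define( 'WP_DEBUG', true );          // Turn on debug mode: notices and deprecation warnings.
define( 'WP_DEBUG_LOG', true );      // Write errors to wp-content/debug.log.
define( 'WP_DEBUG_DISPLAY', false ); // Keep errors out of the rendered page.
define( 'SAVEQUERIES', true );       // Record queries in $wpdb->queries for inspection.
```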
Rule: Don’t copy-pasta code without reviewing it and understanding what it does
Rule: Do things the “WordPress way”
- Use WordPress’ built-in APIs wherever possible
- If you need to access the database, use $wpdb
- If you need to fetch remote data, use the HTTP API, e.g. wp_remote_get
- http://codex.wordpress.org/WordPress_API’s
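As an illustration of the $wpdb bullet, here is a sketch of a prepared query against a custom table. The `myplugin_votes` table and the helper are hypothetical, and the code assumes WordPress is loaded so the global $wpdb object exists.

```php
<?php
/**
 * Counts votes for a post in a custom table, using a prepared query.
 * Hypothetical helper and table name; assumes WordPress is loaded.
 *
 * @param int $post_id Post to count votes for.
 * @return int Number of matching rows.
 */
function myplugin_count_votes( $post_id ) {
    global $wpdb;

    // Build the table name from the site's prefix instead of hard-coding it.
    $table = $wpdb->prefix . 'myplugin_votes';

    // prepare() safely substitutes the value for the %d placeholder.
    $sql = $wpdb->prepare(
        "SELECT COUNT(*) FROM {$table} WHERE post_id = %d",
        $post_id
    );

    return (int) $wpdb->get_var( $sql );
}
```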
Rule: Always write secure code, even if it takes longer
Security should be a part of the development process, not an after-thought.
You eventually get to a point where all the code you write is inherently written with security in mind.
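In practice that habit often boils down to sanitizing input on the way in and escaping output on the way out. A minimal sketch, assuming WordPress is loaded so sanitize_text_field(), wp_unslash(), and esc_html() exist; the helper and the `name` parameter are made up.

```php
<?php
/**
 * Renders a greeting from user-supplied input.
 * Hypothetical helper; assumes WordPress is loaded.
 *
 * @param array $request Raw request data, e.g. $_GET.
 * @return string HTML that is safe to print.
 */
function myplugin_render_greeting( array $request ) {
    // Sanitize on the way in...
    $name = isset( $request['name'] ) ? sanitize_text_field( wp_unslash( $request['name'] ) ) : '';
    if ( '' === $name ) {
        $name = 'stranger';
    }

    // ...and escape on the way out.
    return '<p>Hello, ' . esc_html( $name ) . '!</p>';
}
```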
Rule: Follow the WordPress Coding Standards
Rule: Prefix all the things!
To prevent the dreaded WSOD!
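The WSOD usually shows up when two pieces of code declare a function with the same name, which is a fatal error in PHP. A short sketch of the two usual ways around it; "myplugin" and "MyPlugin_Tweets" are made-up names standing in for your own unique prefix.

```php
<?php
// Risky: a generic name like get_tweets() may already be declared by another
// plugin or theme, and redeclaring a function is a fatal error in PHP.

// Safer: prefix with something unique to your project.
function myplugin_get_tweets() {
    return array();
}

// Alternatively, wrap related functions in a uniquely named class.
class MyPlugin_Tweets {
    public static function get() {
        return array();
    }
}
```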
Rule: Use source control
- Doesn’t matter what system (git, svn, mercurial)
- Think in terms of changesets and small commits
- Benefits: Instant backup, experimentation, disk space (.bk files), teams
Rule: Get your code reviewed often
…and use it as an opportunity to grow!
Rule: Give back to the community
Because it feels good! And it’s good for you too!
Rule: Know when to break the rules
Caveat:
Developing this takes a lot of time, effort, willpower, determination, etc.
Thanks!