Forum:New Analytics Tool: Introducing ParserSpeed

Forums: Yew Grove > New Analytics Tool: Introducing ParserSpeed
Archive
This page or section is an archive.
Please do not edit the contents of this page.
This thread was archived on 16 April 2013 by Cåm.


Hello!

I am pleased today to announce the introduction of an alpha-level tool, ParserSpeed, which can help this community continue to share content quickly and effectively. The tool, available at [[Special:ParserSpeed]], allows users to see how long pages on the wiki take to render. Longer pages with lots of images and templates will naturally take longer to render than shorter articles. With this tool, you will be able to identify which pages on your wiki are the slowest to render, and see specific details as to what might be the cause. Identifying what causes the slow page loads is the first step towards speeding them up.

Why is page load speed important?

This may seem like a silly question. ‘This is the interwebs! It must go faster, faster!’ But it is true, and actually quite remarkable, how quickly we find ourselves losing patience if a website takes even a single second longer than normal to load. Longer page loads make a significant difference in how a website is perceived. We can split that effect on website perception into three areas:

  • User experience - The speed at which a page loads has an impact on what the user gets out of that particular page. Since many users come from search, their first experience may be on a specific page. If that page is slow to load, chances are they will expect all pages on your wiki to be the same. This impression can go a long way, so keeping pages as fast as possible will reduce this risk.
  • User perception - This is how a user feels about a site after a number of interactions with it. Much like user experience, a user's long-term perception of a site is shaped by its speed, though they may be willing to trade some site lag for valuable content. The trade-off threshold is high, however: the content must prove immensely unique and valuable for the user to completely forget or ignore the site speed. Ultimately, if they find a site they feel has a better trade-off (speed vs. content), they will begin to rely upon that site more.
  • Technical perception - Google does factor site speed into its search ranking algorithms. That’s why it’s essential for wikis to strike a balance between having a lot of information on a page that a user might land on from Google and having so much information that it slows down the page load and thus drags down the page’s ranking. Even if a page on a wiki has more useful content than the first four search results, chances are the searcher will either give up after four fruitless results or feel as though they’ve reached the maximum information available before ever finding result #5.

What to Avoid

As I mentioned earlier, it’s a tricky balance, isn’t it? The more essential or interesting information exists about the canon subject, the more important it becomes to collect and share that information in a meaningful and detailed way. From DPL queries to photo galleries, there are a variety of ways built into wikis to share information effectively. But each one has a technical cost.

I’m not going to suggest this community should dramatically change how pages are coded or the way you organize information. However, I do just want to point out what things are generally best to avoid.

  • Really, really long lists - such as long lists of internal links or huge galleries of images. While having a lot of information on one centralized list is often useful, your wiki may be better served by splitting those pages into three or four different pages that are more readable and friendly to the parser and user.
  • Deep, complex templates (a.k.a. “[[w:c:inception|Inceplates]]”) - such as templates which call templates, which call templates, which call templates, etc. Don’t have parameters just for the sake of having parameters. Hard-code whatever makes sense and try not to make templates a catch-all for information.
  • Expensive Parser Functions - There are a number of functions that are considered technically expensive to render because they involve a non-cacheable database query. Those functions include {{#ifexist:}}, {{PAGESINCATEGORY}}, and {{PAGESIZE}}. While each function is very useful for displaying data dynamically, even just a handful can notably slow down a page’s load time (see the sketch after this list).
  • Multiple DPL & Semantic MediaWiki queries - DPL and Semantic MediaWiki are great tools that most of our big wikis use for conditional logic. But like an expensive parser function, these tools make complex queries to the database. Try not to have multiple queries on the same page, and use caching in DPL at all times. If a table of information is fairly static, it may make more sense to code that page out manually rather than have a query slow the page down.
  • MediaWiki messages - Since MediaWiki messages are loaded on each page, don’t use the MediaWiki namespace for parser functions or template calls. Keep that namespace simple.
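
To make the parser function and DPL points more concrete, here is a minimal wikitext sketch. The file names, category and parameters are purely illustrative, and the exact caching parameter depends on which DPL version a wiki runs:

  <!-- Expensive: each #ifexist call is an uncached database lookup -->
  {{#ifexist:File:{{PAGENAME}} detail.png
  |[[File:{{PAGENAME}} detail.png|right]]
  |[[File:Placeholder detail.png|right]]
  }}

  <!-- Cheaper: hard-code the image in the article, or pass it in as a plain template parameter -->
  [[File:Abyssal whip detail.png|right]]

  <!-- DPL: keep it to one query per page and let the result be cached -->
  <dpl>
    category = Grand Exchange
    allowcachedresults = true
  </dpl>

If the target of an #ifexist check never actually changes, hard-coding the link as in the second snippet removes the lookup entirely.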

Pages on this Wiki

So let’s put some of the tips from the first section into practice for this wiki. I want to point out the three slowest pages in the main namespace and explain why they take longer to load than others.

The three NS:0 pages that take the longest to parse on the RuneScape Wiki are:

How to Use ParserSpeed

First, I want to say that we will be having a GoToMeeting webinar to discuss ParserSpeed and general site speed principles on Thursday, April 4th at 10:00 PM UTC. That’s 3 PM PDT (American West Coast), 6 PM EDT (American East Coast) and 11 PM BST (United Kingdom). We will answer questions on this post as well until then, but we also just want to have a live demo and interactive session to allow you to ask questions of the Wikia staff. You can register here: https://www3.gotomeeting.com/register/987185502

Since this is an alpha-level feature, we haven’t yet added a few things like a glossary, so I want to go ahead and review what each column represents (a sample of where MediaWiki itself reports several of these numbers follows the list):

  • Avg. Parsing Time - The average time it takes for Wikia’s servers to parse the page in the last 30 days.
  • Min. & Max. Parsing Time - The minimum and maximum times it took for Wikia’s servers to parse the page in the last 30 days (different parsing times depend on size and complexity of parameters).
  • Wikitext Size - The size (in KB) of the saved wikitext.
  • HTML Size - The size (in KB) of the saved wikitext when converted to HTML.
  • Exp. Functions - The number of expensive parser functions on the page.
  • Node count - Number of HTML elements a page is outputting.
  • Post-expand size - The size of the page after parameters are inserted.
  • Template argument size - Best explained here.
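
As a cross-check, MediaWiki itself reports several of these numbers in an HTML comment (the “NewPP limit report”) near the bottom of each rendered page’s source. The figures below are purely illustrative, and the limits on the right-hand side depend on the wiki’s configuration:

  <!--
  NewPP limit report
  Preprocessor node count: 41230/1000000
  Post-expand include size: 812345/2048000 bytes
  Template argument size: 96500/2048000 bytes
  Expensive parser function count: 7/100
  -->

If any of these counts are close to their limit, the page is a good candidate for the splitting and simplification suggested above.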

Part of an alpha-level release is to not only illustrate the functionality of a feature but to gather feedback early in the process, allowing us to add more features or change features to meet user needs and requests. We very much want to hear how your community will use the tool, what functionality you feel is either lacking or redundant, or any other constructive feedback you would like to share. --DaNASCAT<staff /> (help forum | blog) 18:56, March 29, 2013 (UTC)

Discussion

It would appear I am #1. MolMan 19:00, March 29, 2013 (UTC)

Character count would be neat Ronan Talk 19:03, March 29, 2013 (UTC)

Are we able to view only certain namespace(s)? I am The Mol and I own the slowest legitimate page on the wiki. 19:06, March 29, 2013 (UTC)

I'm going to put aside my newfound awesomeness real quickly and make some serious comments. Clearly there are some bugs, but it is still a pretty cool tool. It seems most of our large pages (looking at the ones correctly labeled as large) are Grand Exchange related or maintenance. Obviously the maintenance pages don't matter but perhaps me, cook, or some other DPL nerd could look into splitting up or otherwise optimizing those GE pages that implement DPL and apply the same treatment to those that don't. MolMan 19:37, March 29, 2013 (UTC)

Every time I parse it, it goes down. I still, however, maintain the highest "Max. parsing time". MolMan 19:46, March 29, 2013 (UTC)

Comment - I think it looks nice. Some improvements imo

  • Sort by namespace
  • Give the page a title
  • View a certain page
  • Perhaps allow transcluding the result, something like {{Special:ParserSpeed/RuneScape}}, to display the load speeds of certain pages so it can be placed on a maintenance page for observation?
  • Maybe allowing the page to be viewed by all users, or maybe just the logged in ones

Thanks TimmyQ: Lord of the Mice. 22:05, March 29, 2013 (UTC)

Sort by namespace exists, but there's no way to choose a namespace or restrict to only one; it just goes by the number you can get with {{ns:}}. MolMan 22:08, March 29, 2013 (UTC)
Better namespace filtering is definitely the first thing I've heard requested, so it will go to the top of the request queue. TyA, can you explain "View a certain page" more? Not sure what exactly you mean by that. The result transclusion may be a bit much for now, but it definitely sounds like a good "wishlist" item. --DaNASCAT<staff /> (help forum | blog) 23:32, March 29, 2013 (UTC)
I think he means (and I agree with him if I am correct) the ability to search for a specific page to get its stats instead of having to trawl through all of the listed pages. MolMan 23:33, March 29, 2013 (UTC)
Mol is correct. 23:47, March 29, 2013 (UTC)
Adding a filter similar to Special:AllMessages would probably suffice. cqm 00:46, 30 Mar 2013 (UTC)

Comment - Mildly amused at how the forum archives are some of the top pages on that list, although it's something I'm very much aware of. Template transclusion is probably a little hard to get away from on most pages, other than migrating to SMW or similar. I'm interested in whether the size of the HTML in relation to the wikitext is something we can reduce. Most of it comes down to the wikitext parser for the majority of pages, surely? Excluding [https://runescape.wiki/wiki/Random_page Random page], and other such bad practices.

Expensive functions are something I've seen mentioned before, but I don't really know if they're significantly bad things to use (if they were, why can we use them?), or if it's something we should try and remove dependency on, or something we don't need to worry about unless there's a dozen or so of them on a single page. Is there a viable replacement for #ifexist that isn't so server intensive? cqm 00:46, 30 Mar 2013 (UTC)

Expensive parsers are usually awesome, but they take a heavy toll on the servers. They're not significantly bad and only in large amounts do they cause any visible lag. To give you an idea: Every item page uses if exist once or twice. If exist isn't that great to be honest and I'd like to just limit our use of it. It would be nicer if it didn't create technical links from use. But as for expensive parsers as a whole, they are the least contributory to the speed problems at the wiki, at least as far as I know. MolMan 01:09, March 30, 2013 (UTC)
Cåm, there is a point where MediaWiki warns you that there are "too many" expensive parser functions, but that doesn't mean it's perfectly fine up to that point. Each expensive parser function does take computing time. I believe the expensive parser function limit is 100, but that doesn't mean that 99 exp. parser functions have no effect. (Man, that sounds like a good drinking song ready to be written about expensive parser functions on the wall).
We continue to add feedback from our conversations with the alpha wikis. Please also remember we will be having a webinar about this tool and general site parser speed basics a few hours from now. Everyone is free to register! --DaNASCAT<staff /> (help forum | blog) 16:33, April 4, 2013 (UTC)
Mix it up with centurion (wikipedia really does have everything). cqm 19:59, 4 Apr 2013 (UTC)

Closed - Discussion has died down. cqm 10:21, 16 Apr 2013 (UTC)