26 Jan 2011

SEO advice: url canonicalization

Q: What is a canonical url? Do you have to use such a weird word, anyway?
A: Sorry that it’s a strange word; that’s what we call it around Google. Canonicalization is the process of picking the best url when there are several choices, and it usually refers to home pages. For example, most people would consider these the same urls:

* www.example.com
* example.com/
* www.example.com/index.html
* example.com/home.asp

But technically all of these urls are different. A web server could return completely different content for all the urls above. When Google “canonicalizes” a url, we try to pick the url that seems like the best representative from that set.

Q: So how do I make sure that Google picks the url that I want?
A: One thing that helps is to pick the url that you want and use that url consistently across your entire site. For example, don’t make half of your links go to http://example.com/ and the other half go to http://www.example.com/ . Instead, pick the url you prefer and always use that format for your internal links.

Q: Is there anything else I can do?
A: Yes. Suppose you want your default url to be http://www.example.com/ . You can configure your webserver so that if someone requests http://example.com/, it does a 301 (permanent) redirect to http://www.example.com/ . That helps Google know which url you prefer to be canonical. Adding a 301 redirect can be an especially good idea if your site changes often (e.g. dynamic content, a blog, etc.).
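
As a rough illustration of that redirect rule, here is a minimal Python sketch of the decision a webserver would make. The function name `redirect_for` and the `aliases` parameter are made up for this example, and a real site would normally do this in the webserver configuration rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_for(url, canonical_host="www.example.com", aliases=("example.com",)):
    """Return (301, location) if the url uses a non-canonical host, else None.

    canonical_host and aliases are assumptions for this sketch: the hostname
    you picked as canonical, and the other hostnames that serve the same site.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in aliases:
        return None  # already canonical (or not one of our hosts): no redirect
    # Keep the scheme, path and query; only the hostname changes.
    # Note: any explicit port in the original url is dropped in this sketch.
    target = urlunsplit((parts.scheme, canonical_host, parts.path or "/", parts.query, ""))
    return 301, target


print(redirect_for("http://example.com/"))
print(redirect_for("http://www.example.com/"))
```

The redirect keeps the path and query string, so deep links to the non-www host land on the matching www page rather than on the home page.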

Q: If I want to get rid of domain.com but keep www.domain.com, should I use the url removal tool to remove domain.com?
A: No, definitely don’t do this. If you remove one of the www vs. non-www hostnames, it can end up removing your whole domain for six months. Definitely don’t do this. If you did use the url removal tool to remove your entire domain when you actually only wanted to remove the www or non-www version of your domain, do a reinclusion request and mention that you removed your entire domain by accident using the url removal tool and that you’d like it reincluded.

Q: I noticed that you don’t do a 301 redirect on your site from the non-www to the www version, Matt. Why not? Are you stupid in the head?
A: Actually, it’s on purpose. I noticed that several months ago but decided not to change it on my end or ask anyone at Google to fix it. I may add a 301 eventually, but for now it’s a helpful test case.

Q: So when you say www vs. non-www, you’re talking about a type of canonicalization. Are there other ways that urls get canonicalized?
A: Yes, there can be a lot, but most people never notice (or need to notice) them. Search engines can do things like keeping or removing trailing slashes, trying to convert urls with upper case to lower case, or removing session IDs from bulletin board or other software (many bulletin board software packages will work fine if you omit the session ID).
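
To make those kinds of canonicalization concrete, here is a small Python sketch of a url normalizer. It is only an illustration and not how Google (or any search engine) actually does it. The session parameter names in `SESSION_PARAMS` are assumptions, and it only lower-cases the scheme and hostname, since paths can be case-sensitive on many servers.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: query parameter names that carry session IDs; adjust for your software.
SESSION_PARAMS = {"sid", "phpsessid", "sessionid"}


def normalize(url):
    """Return a normalized form of url (illustrative rules only)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = parts.netloc.lower()   # hostnames are case-insensitive
    path = parts.path or "/"      # treat an empty path as the root
    # Drop session ID parameters and keep everything else in its original order.
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key.lower() not in SESSION_PARAMS]
    return urlunsplit((scheme, host, path, urlencode(kept), ""))


print(normalize("HTTP://WWW.Example.com"))
print(normalize("http://forum.example.com/viewtopic.php?t=42&sid=abc123"))
```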

Q: Let’s talk about the inurl: operator. Why does everyone think that if inurl:mydomain.com shows results that aren’t from mydomain.com, it must be hijacked?
A: Many months ago, if you saw someresult.com/search2.php?url=mydomain.com, that would sometimes have content from mydomain. That could happen when the someresult.com url was a 302 redirect to mydomain.com and we decided to show a result from someresult.com. Since then, we’ve changed our heuristics to make showing the source url for 302 redirects much more rare. We are moving to a framework for handling redirects in which we will almost always show the destination url. Yahoo handles 302 redirects by usually showing the destination url, and we are in the middle of transitioning to a similar set of heuristics. Note that Yahoo reserves the right to have exceptions on redirect handling, and Google does too. Based on our analysis, we will show the source url for a 302 redirect less than half a percent of the time (basically, when we have strong reason to think the source url is correct).

Q: Okay, how about supplemental results. Do supplemental results cause a penalty in Google?
A: Nope.

Q: I have some pages in the supplemental results that are old now. What should I do?
A: I wouldn’t spend much effort on them. If the pages have moved, I would make sure that there’s a 301 redirect to the new location of pages. If the pages are truly gone, I’d make sure that you serve a 404 on those pages. After that, I wouldn’t put any more effort in. When Google eventually recrawls those pages, it will pick up the changes, but because it can take longer for us to crawl supplemental results, you might not see that update for a while.
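
The moved-versus-gone rule can be sketched in a few lines of Python. The `MOVED` and `LIVE` tables and the function name `respond` are hypothetical; they stand in for whatever your site uses to know which pages exist.

```python
# Hypothetical site data: old paths that moved, and paths that still exist.
MOVED = {"/old-page.html": "/new-page.html"}
LIVE = {"/", "/new-page.html"}


def respond(path):
    """Return (status, location) for a requested path."""
    if path in LIVE:
        return 200, None          # page exists: serve it
    if path in MOVED:
        return 301, MOVED[path]   # page moved: permanent redirect to the new location
    return 404, None              # page is truly gone


print(respond("/old-page.html"))
print(respond("/deleted-page.html"))
```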

That’s about all I can think of for now. I’ll try to talk about some examples of 302s and inurl: soon, to help make some of this more concrete.

Introducing the Android Emulator, managing Android Virtual Devices (AVD) | Hello Android

The emulator available in the Android SDK is not just a tool that lets you test applications without installing them on a real device, or even owning one. With the proper configuration it can also test situations that are hard to reproduce on a physical device.

After installing the Android plugin and SDK in Eclipse, an icon is automatically placed on the toolbar for quick access to the Android SDK and AVD (Android Virtual Device) manager.

For an AVD you can set:

* screen resolution
* Android version
* SD card size
* various hardware availability and properties such as GPS, RAM, accelerometer, camera, cache etc.

When you press the New hardware button, a dialog pops up where you select the hardware; after adding it you can change its value attribute. The value is often a boolean that sets whether the hardware is available, but it can also be, for example, an integer that sets the size of the RAM.

Of course, if you set hardware like GPS as available on the emulator, the running application will detect that it is available, but this 'virtual' GPS naturally cannot provide a GPS coordinate on its own. As we discussed in a previous tutorial, a coordinate can be sent by using telnet from the command line. The same goes for similar hardware.
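
As a sketch of what that telnet session does, the Python below builds the emulator console's `geo fix` command and sends it over a plain socket. It assumes the console is listening on localhost port 5554 (the usual port for the first emulator instance). The function names are made up for this example, and newer emulator versions may also require an authentication step before accepting commands.

```python
import socket


def geo_fix_command(latitude, longitude):
    """Build the console command. Note that 'geo fix' expects longitude first."""
    return f"geo fix {longitude} {latitude}\n"


def send_fix(latitude, longitude, host="localhost", port=5554):
    """Send a coordinate to a running emulator (host and port are assumptions)."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.recv(1024)  # read the console greeting before sending anything
        conn.sendall(geo_fix_command(latitude, longitude).encode("ascii"))


# Only builds the command text; call send_fix(...) when an emulator is running.
print(geo_fix_command(47.4979, 19.0402), end="")
```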

All of this makes it possible to test, for example, how the application behaves when the SD card does not have enough space or memory runs low, or how the user interface fits different resolutions.
Backwards compatibility is another important question. Each project has Android versions it is meant to be compatible with. This can be validated easily, since all previous Android versions can be downloaded. It is an essential test, because the API versions available for development and the versions most widely used worldwide can differ greatly.

11 Jan 2011

How to extend XSLT using built in extension functions. – Intel Software Network Blogs

XSLT 2.0 and to some extent 1.0 are powerful languages when it comes to transforming documents and even for performing some tasks. But, as is often the case, to do something odd or unusual can often be impenetrable or just plain difficult. One of the advantages of using Intel® SOA Expressway is that most of the extension functions we have written to make configuration easier for BPEL based workflow are also available to the XSLT developer.
For those not familiar with SOA Expressway extension functions, they are granular operations that can be performed on the contents of messages or XML / JSON documents which SOA Expressway can embed into XPath or XSLT. What they add up to is a Swiss Army Knife for doing all sorts of useful things, especially when SOA Expressway is used in some message mediation or security mediation capacity.
The range of functions encompasses:
  • digest generation (MD5, SHA, etc.)
  • exslt functions for dates and regular expressions.
  • crypto and canonicalization.
  • full digital signature generation and verification.
  • encoding and decoding to binary, base64 etc.
  • timestamping, UUID generation, random numbers.
  • cookie and authentication token handling.
  • MIME attachment get and set.
Okay I could go on; there were more than two hundred functions the last time I counted. Go to our site at www.dynamicperimeter.com and request the full documentation set to find out more.
So how does an extension function get used in everyday life?
Here's how to write a message to the transaction log from within your XSLT. I'm assuming you have constructed a basic workflow and already have an XSL Transform action within it.
The basic form would look like this:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" xmlns:soae-xf="http://www.intel.com/soae/xpath/">
<xsl:variable name="log" select="soae-xf:write-transaction-log('info',concat('Transaction ID:  ',soae-xf:get-transaction-id(),'; Comment: ','test',';'))"/>
<xsl:template match="/">
<!-- The variable is parsed lazily and is only evaluated when it is used in the test below. -->
<xsl:if test="$log"></xsl:if>
<xsl:apply-templates />
</xsl:template>
</xsl:stylesheet>



There are three parts to remember:
1. Make sure your transform has the soae-xf, exslt or soae-cache namespace declared as appropriate (here, the xmlns:soae-xf attribute on the xsl:stylesheet element).
2. Declare your Extension Function with a variable. In this case $log (the xsl:variable element).
3. Do something with the variable to force its evaluation. In this case we test $log for some contents (the xsl:if element). This is a necessary step, since one of the performance features of the XSLT engine is lazy parsing, which skips the evaluation of variables that may turn out to be unnecessary.
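
If lazy evaluation is unfamiliar, the following Python sketch mimics the behaviour described in the three steps above. It is an analogy only and not how the SOA Expressway engine is implemented. `LazyVariable` and `write_log` are made-up names, with `write_log` standing in for the side-effecting extension function call.

```python
class LazyVariable:
    """A variable whose initializer runs only the first time its value is read."""

    def __init__(self, compute):
        self._compute = compute
        self._done = False
        self._value = None

    def value(self):
        if not self._done:
            self._value = self._compute()
            self._done = True
        return self._value


log_lines = []


def write_log():
    # Stand-in for a side-effecting extension function call.
    log_lines.append("Transaction ID: 123; Comment: test;")
    return True


log = LazyVariable(write_log)      # step 2: declared, but nothing is logged yet
lines_after_declaration = list(log_lines)

if log.value():                    # step 3: like <xsl:if test="$log">, forces evaluation
    pass

print(lines_after_declaration)
print(log_lines)
```

Without the `if log.value()` line the log entry would never be written, which is why the stylesheet needs the otherwise empty xsl:if.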
Interoperation between the workflow variables and execution steps and the nitty gritty of XSLT is necessary because it gives the developer added flexibility when it comes to mediating messaging in a product that's used as a gateway or ESB.

Chintan Shah's Blog: XSLT Troubleshooting

XSLT Troubleshooting

Maybe it is just my experience, but I feel that in general XSLT doesn’t have a lot of troubleshooting support. In Java, for example, no matter what JVM you use, you can pretty much always get a stack trace (even for NPE errors). In XSLT, depending on the XML parser you use, sometimes you get line numbers and sometimes you don’t, and it doesn't provide a lot of detail in the log. Moreover, there aren't really any good tools for troubleshooting. In the AIA world, it gets even crazier. Most of the XSLT functions that come with AIA run only on the server side, so you are hopeless and helpless when you want to do some troubleshooting on your local box. It gets even worse when you get FOTY0001-type errors. Sometimes you get more details in the log file when there is a FOTY0001, and sometimes you don't.

Here are a couple of ideas that can make your life easier.

1) XSL comment
You can put an XSL comment in your code, so it will show up in the output when the XSLT has completed. It helps a lot when you are just trying to print the values of variables. Pretty much all credit goes to Mahesh Narayanaswamy (absolute XSL genius) for this suggestion.

2) XSL out
This very cool idea came out of a discussion with Amol Vaidya. XSL comments didn’t help much when the XSLT was breaking completely and we were unable to get the line number or the root cause. All we needed was a capability similar to System.out inside the XSL, so we would know how far into the XSL we got and what the variable values were. We just wanted to get all the information before the point of failure. A custom XSLT log function came in pretty handy for putting trace logging in the XSLT. This way we can get as much information as possible about the XSLT before the point of failure.
We wrote a custom XSL function called XSLTLog and registered it in the BPEL RT and ESB XSL library.

3) Java code
I also wrote Java code that takes XML input and executes the XSL. We have to register this Java code on the server, as a lot of AIA XSL functions rely on server-side components. We also took it further to find the exact failing line number in the XSL using a binary search approach.
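
The binary search idea can be sketched as follows. This is a generic Python illustration and not the Java code mentioned above. `prefix_fails` is a hypothetical callback that runs the transform with only the first n lines (or instructions) enabled and reports whether it fails. The sketch assumes that once the bad line is included, every longer prefix fails too.

```python
def first_failing_line(line_count, prefix_fails):
    """Return the 1-based number of the first line whose inclusion makes the
    transform fail, or None if the whole stylesheet runs cleanly.

    prefix_fails(n) is a hypothetical callback: True if running with only the
    first n lines enabled fails.
    """
    if not prefix_fails(line_count):
        return None
    lo, hi = 1, line_count  # invariant: the first failing line is within [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix_fails(mid):
            hi = mid        # failure already present in the first mid lines
        else:
            lo = mid + 1    # first mid lines are fine, so look further down
    return lo


# Simulated stylesheet of 100 lines in which line 37 is the broken one.
print(first_failing_line(100, lambda n: n >= 37))
```

For a stylesheet of a few hundred lines this needs fewer than ten runs instead of one run per line. In practice the callback has to disable whole instructions rather than cut the file at an arbitrary line, so that the XSL stays well-formed.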