In the category 'Web 2.0, SOA, and Web Services': Facebook will launch tag suggestions, a feature that uses facial recognition software to suggest the names of friends appearing in newly uploaded photos on the social network, where some 100 million photos are uploaded each day.
Facebook alerts users when they've been tagged, and they can untag themselves at any time. The idea of facial recognition software makes some people nervous. Google can do this with its Google Goggles visual search application, but has declined to do so for fear of the privacy backlash. Facebook has the benefit of a closed social network of 550 million users at its fingertips, so it can make such a push with the right safeguards..."
From Justin Mitchell's blog article: "Unlike photos that get forgotten in a camera or an unshared album, tagged photos help you and your friends relive everything from that life-altering skydiving trip to a birthday dinner where the laughter never stopped. Tags make photos one of the most popular features on Facebook... We've been working to make this process easier for you. First we added group tagging, so you could type one name and apply it to multiple photos of the same person. Now we're announcing tag suggestions, which will make tagging multiple photos even more convenient.
Because photos are such an important part of Facebook, we want to be sure you know exactly how tag suggestions work: When you or a friend upload new photos, we use face recognition software -- similar to that found in many photo editing tools -- to match your new photos to other photos you're tagged in. We group similar photos together and, whenever possible, suggest the name of the friend in the photos. If for any reason you don't want your name to be suggested, you will be able to disable suggested tags in your Privacy Settings. Just click 'Customize Settings' and 'Suggest photos of me to friends.' Your name will no longer be suggested in photo tags, though friends can still tag you manually..."
http://www.eweek.com/c/a/Web-Services-Web-20-and-SOA/Facebook-Photos-Use-Facial-Recognition-for-Easier-Tagging-517287/
See also Justin Mitchell's Facebook Blog: http://blog.facebook.com/blog.php?post=467145887130
Alice Lipowicz, Government Computer News: A new report from the U.S. President's Council of Advisors on Science and Technology declares that the Health and Human Services Department should help develop and promote a universal exchange language and infrastructure for patient health care information that would more readily enable the data to flow between disparate systems. "Extensible Markup Language (XML), which has a long track record of facilitating interoperability among systems, should become the basis for health IT as well, the advisory group says... Current electronic health care record systems are based on proprietary technologies, which make sharing difficult. XML would facilitate a new level of sharing, with metadata and common data elements that can be tagged with privacy and security restrictions, according to the 108-page report titled 'Realizing the Full Potential of Health Information Technology to Improve Healthcare for Americans: The Path Forward'. The universal language would be used by physicians, hospitals, researchers and public and private agencies to facilitate information sharing, the report states. The report calls on HHS' Centers for Medicare and Medicaid Services and the Office of the National Coordinator for Health Information Technology to develop guidelines to spur adoption of the universal exchange language that allows for transfer of patient health data while protecting privacy... Under the new exchange system, patient data would be divided into small individual pieces, which are tagged for attributes, provenance, and required security and privacy provisions. This allows for a more sophisticated model for data exchange and for protection of privacy..."
http://gcn.com/articles/2010/12/09/health-it-should-build-on-xml-for-data-exchange-white-house-panel-says.aspx
See also the text of the report: http://www.whitehouse.gov/sites/default/files/microsites/ostp/pcast-health-it-report.pdf
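The PCAST report's central idea is that each small piece of patient data travels with its own provenance and privacy metadata, but it does not prescribe a concrete syntax. The following Python sketch therefore uses invented element and attribute names purely to illustrate what such a privacy-tagged data element might look like:

    # Hypothetical "tagged data element" in the spirit of the PCAST report:
    # a small piece of patient data carrying its own provenance and privacy
    # metadata. All element and attribute names are invented for illustration.
    import xml.etree.ElementTree as ET

    elem = ET.Element("dataElement", {
        "id": "bp-2010-12-01",
        "provenance": "Dr. Smith, Community Clinic, 2010-12-01",
        "privacyPolicy": "treatment-only",      # who may use the data
        "securityLabel": "restricted",          # handling requirement
    })
    elem.text = "Blood pressure 120/80 mmHg"

    # Each such fragment could be exchanged and filtered on its own terms.
    print(ET.tostring(elem, encoding="unicode"))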
The U.S. Library of Congress (LOC) announced that a MADS/RDF ontology developed at LOC is available for a public review period until January 14, 2011. The MADS/RDF (Metadata Authority Description Schema in RDF) vocabulary is a data model for authority and vocabulary data used within the library and information science (LIS) community, which includes museums, archives, and other cultural institutions. It is presented as an OWL ontology. McCallum notes that MADS/RDF may become the underlying format for important data sets such as LCSH and the LC name authorities.
Based on the MADS/XML schema, MADS/RDF provides a means to record data from the Machine Readable Cataloging (MARC) Authorities format in RDF for use in semantic applications and Linked Data projects. MADS/RDF is a knowledge organization system designed for use with controlled values for names (personal, corporate, geographic, etc.), thesauri, taxonomies, subject heading systems, and other controlled value lists.
MADS is closely related to SKOS, the Simple Knowledge Organization System and a widely supported and adopted RDF vocabulary. Unlike SKOS, however, which is very broad in its application, MADS/RDF is designed specifically to support authority data as used by and needed in the LIS community and its technology systems. Given the close relationship between the aim of MADS/RDF and the aim of SKOS, the MADS ontology has been fully mapped to SKOS.
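As a rough illustration of how a MADS/RDF description can sit alongside its SKOS mapping, the sketch below builds a small Turtle snippet and, if the third-party rdflib package is installed, loads it as an RDF graph. The class and property names are assumptions drawn loosely from the draft vocabulary, not an authoritative excerpt:

    # Hypothetical authority record expressed with MADS/RDF and SKOS terms.
    turtle = """
    @prefix madsrdf: <http://www.loc.gov/mads/rdf/v1#> .
    @prefix skos:    <http://www.w3.org/2004/02/skos/core#> .

    <http://id.loc.gov/authorities/subjects/example>
        a madsrdf:Topic , skos:Concept ;
        madsrdf:authoritativeLabel "Semantic Web"@en ;
        skos:prefLabel "Semantic Web"@en .
    """

    try:
        import rdflib                        # third-party, optional here
        g = rdflib.Graph()
        g.parse(data=turtle, format="turtle")
        for subject, predicate, obj in g:    # the same data, as triples
            print(subject, predicate, obj)
    except ImportError:
        print(turtle)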
Community feedback is encouraged and welcome. The MODS listserv for MADS/XML is maintained as part of the community work on MODS (Metadata Object Description Schema) and is the preferred forum for feedback...
MODS is a schema for a bibliographic element set that may be used for a variety of purposes, and particularly for library applications..."
http://www.loc.gov/standards/mads/rdf/
See also the MADS XML Format for Authorities Data: http://www.loc.gov/standards/mads/
This article describes how recent developments in Web technology have affected the relationship between URI and resource representation and the related consequences. "Historically, URIs were mostly seen as simply the way you accessed Web pages. These pages were hand-authored, relatively stable and simply shipped out on demand. More and more often that is no longer the case... Insofar as there are definitive documents about all this, they all agree that URIs are, as the third initial says, identifiers, that is, names. They identify resources, and often (although not always) allow you to access representations of those resources. 'Resource' names a role in a story, not an intrinsically distinguishable subset of things, just as 'referent' does in ordinary language. Things are resources because someone created a URI to identify them, not because they have some particular properties in and of themselves. 'Representation' names a pair: a character sequence and a media type. The media type specifies how the character string should be interpreted. For example JPG or HTML or MP3 would be likely media types for representations of an image of an apple, a news report about an orchard or a recording of a Beatles song, respectively.
As long ago as the mid-1990s, information scientists had taken the URI-resource-representation split to its logical conclusion: it was OK to create URIs for resources for which no representation existed yet (for example a planned but not-yet-drafted catalogue entry), or even for resources for which no (retrievable) representation could in principle ever exist (a particular physical book, or even its author). By the end of the 1990s, the generalisation of the resource concept was complete, and we find, in the defining document for URIs (since superseded, but without significant change in this regard): 'A resource can be anything that has identity. Familiar examples include an electronic document, an image, a service (e.g., 'today's weather report for Los Angeles'), and a collection of other resources. Not all resources are network 'retrievable'; e.g., human beings, corporations, and bound books in a library can also be considered resources'.
Since then the principle that a URI can be used to identify anything; that is, that there are few if any limits on what can 'be a resource', has assumed more and more importance, particularly within one community, namely the participants in what is termed the Semantic Web programme.
This move is not just a theoretical possibility: there are more and more URIs appearing 'in the wild' which do not identify images, reports, home pages or recordings, but rather people, places and even abstract relations... What if we have a URI which identifies, let us say, not the Oaxaca weather report, but Oaxaca itself, that city in the Sierra Madre del Sur south-east of Mexico City? What should happen if we try to access that URI? If the access succeeds, the representation we get certainly will not reproduce Oaxaca very well: we will not be able to walk around in it, or smell the radishes if it happens to be 23 December.
This is the point at which the word 'representation' is a problem. Surely we can retrieve some kind of representation of Oaxaca: a map, or a description, or a collection of aerial photographs. These are representations in the ordinary sense of the word, but not in the technical sense it is used when discussing Web architecture. Unfortunately, beyond pointing to the kind of easy examples we have used all along (a JPG is a good representation of an image, a HTML document can represent a report very well, an MP3 file can represent a recording pretty faithfully), it is hard to give a crisp definition of what 'representation'
means in the technical sense... There is real debate underway at the moment as to exactly what it means for a Web server to return a 200 OK response code, and about exactly what kind of response is appropriate to a request for a URI which identifies a non-information resource.
This question arises because, particularly in the context of Semantic Web applications, although no representation of the resource itself may be available, a representation of an information resource which describes that resource may be available..."
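A minimal way to observe this pattern in practice is to dereference a URI that identifies a thing rather than a document and look at the HTTP response. The sketch below, using only the Python standard library, assumes the common Linked Data convention of answering with a 303 'See Other' redirect to a describing document; whether any particular server (the DBpedia URI is used here only as an example) behaves this way is an assumption, not a guarantee:

    # Dereference a URI that identifies a thing (not a document) and inspect
    # the response instead of following redirects automatically.
    import http.client

    conn = http.client.HTTPConnection("dbpedia.org")
    conn.request("GET", "/resource/Oaxaca",
                 headers={"Accept": "application/rdf+xml"})
    resp = conn.getresponse()

    print(resp.status, resp.reason)       # e.g. 303 See Other
    print(resp.getheader("Location"))     # URI of a document describing the thing
    conn.close()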
http://www.ariadne.ac.uk/issue65/thompson-hs/
See also Architecture of the World Wide Web Volume One: http://www.w3.org/TR/webarch/
"Since 1999 the W3C has been working on a set of Semantic Web standards that have the potential to revolutionize web search. Also known as Linked Data, the Machine-Readable Web, the Web of Data, or Web 3.0, the Semantic Web relies on highly structured metadata that allow computers to understand the relationships between objects. Semantic web standards are complex, and difficult to conceptualize, but they offer solutions to many of the issues that plague libraries, including precise web search, authority control, classification, data portability, and disambiguation.
This article will outline some of the benefits that linked data could have for libraries, will discuss some of the non-technical obstacles that we face in moving forward, and will finally offer suggestions for practical ways in which libraries can participate in the development of the semantic web.
As Linked Data initiatives proliferate there has, unsurprisingly, been increased debate about exactly what we mean when we refer to Linked Data and the Semantic Web. Are the phrases interchangeable? Do they refer to a specific set of standards? A specific technology stack?
For the purposes of this paper we use the term 'Semantic Web' to refer to a full suite of W3C standards including RDF, the SPARQL query language, and the OWL web ontology language. As for 'Linked Data', we will accept the two-part definition offered by the research team at Freie Universität
Berlin: 'The Web of Data is built upon two simple ideas: First, to employ the RDF data model to publish structured data on the Web. Second, to [use http URIs] to set explicit RDF links between data items within different data sources'. We can see from this definition that Linked Data has two distinct aspects: exposing data as RDF, and linking RDF entities together...
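A tiny, self-contained illustration of those two ideas, with invented example URIs: data expressed in the RDF model, plus one explicit link (here owl:sameAs) pointing from one data source into another:

    # Two RDF statements: the first describes a local resource, the second
    # links our data source to an external one (DBpedia). URIs are examples.
    triples = [
        ("<http://example.org/books/moby-dick>",
         "<http://purl.org/dc/terms/creator>",
         "<http://example.org/authors/melville>"),
        ("<http://example.org/authors/melville>",
         "<http://www.w3.org/2002/07/owl#sameAs>",
         "<http://dbpedia.org/resource/Herman_Melville>"),
    ]

    # Serialise as N-Triples, the simplest line-oriented RDF syntax.
    for s, p, o in triples:
        print(s, p, o, ".")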
http://www.dlib.org/dlib/november10/byrne/11byrne.html
See also Wikipedia on Linked Data: http://en.wikipedia.org/wiki/Linked_Data
"The Web is critical not merely to the digital revolution but to our continued prosperity -- and even our liberty. Like democracy itself, it needs defending... [When] the world wide web went live, on my physical desktop in Geneva, Switzerland, in December 1990, it consisted of one Web site and one browser, which happened to be on the same computer.
The simple setup demonstrated a profound concept: that any person could share information with anyone else, anywhere. In this spirit, the Web spread quickly from the grassroots up. Today, at its 20th anniversary, the Web is thoroughly integrated into our daily lives. We take it for granted, expecting it to 'be there' at any instant, like electricity.
The Web as we know it, however, is being threatened in different ways.
Some of its most successful inhabitants have begun to chip away at its principles. Large social-networking sites are walling off information posted by their users from the rest of the Web. Wireless Internet providers are being tempted to slow traffic to sites with which they have not made deals.
Why should you care? Because the Web is yours. It is a public resource on which you, your business, your community and your government depend.
The Web is also vital to democracy, a communications channel that makes possible a continuous worldwide conversation. The Web is now more critical to free speech than any other medium. It brings principles established in the U.S. Constitution, the British Magna Carta and other important documents into the network age: freedom from being snooped on, filtered, censored and disconnected...
Several principles are key to assuring that the Web becomes ever more valuable. The primary design principle underlying the Web's usefulness and growth is universality. When you make a link, you can link to anything. That means people must be able to put anything on the Web, no matter what computer they have, software they use or human language they speak and regardless of whether they have a wired or wireless Internet connection... Decentralization is another important design feature. You do not have to get approval from any central authority to add a page or make a link. All you have to do is use three simple, standard protocols: write a page in the HTML (hypertext markup language) format, name it with the URI naming convention, and serve it up on the Internet using HTTP (hypertext transfer protocol). Decentralization has made widespread innovation possible and will continue to do so in the future... A great example of future promise, which leverages the strengths of all the principles, is linked data..."
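As a small illustration of those three ingredients using only the Python standard library (the file name and port are arbitrary): write a page in HTML, give it a URI, and serve it over HTTP, with no central approval required:

    # 1. Write a page in HTML; 2. it gets the URI http://localhost:8000/hello.html;
    # 3. serve it up over HTTP. No registration with any central authority.
    from http.server import HTTPServer, SimpleHTTPRequestHandler
    from pathlib import Path

    Path("hello.html").write_text(
        "<html><body><h1>Hello, Web</h1>"
        "<a href='http://www.w3.org/'>a link to anything</a></body></html>")

    HTTPServer(("localhost", 8000), SimpleHTTPRequestHandler).serve_forever()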
http://www.scientificamerican.com/article.cfm?id=long-live-the-web
See also the 2006 paper on linked data: http://www.w3.org/DesignIssues/LinkedData.html
"Identity matters: in everyday life we present different "faces" to different people according to the social context, e.g. family, personal, and professional. Our online life is the same, and our privacy depends on keeping these different faces compartmentalized. To support this, we need ways to restrict access to services.
A powerful way to implement this is with anonymous credentials. Imagine the student union providing electronic credentials to all students that asserts that you are a current student at that college/university. This is an electronic equivalent of a student ID card. When you go online to the social website operated by the student union, you are asked to prove you are a current student, but not for your actual identity.
I have been working with Patrik Bichsel on an implementation of this approach based upon a Firefox extension and the open source idemix (identity mixer) library. The extension recognizes policy references in web page markup and asks the user for a PIN or pass phrase to unlock her credentials and construct a zero-knowledge proof, which is then sent to the website for verification...
This has been done with support from the EU PrimeLife project, and we hope to be able to make the extension and servlet widely available in the near future. Further work is needed on tools for simplifying the creation of credentials and proof specifications, and there are opportunities for integrating biometric techniques as alternatives to typing a PIN or pass phrase. One possibility would be for the browser to confirm your identity by taking a photo of your face with the camera built into phones and notebook computers..."
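The following is a deliberately oversimplified sketch of that message flow, not the idemix protocol and not a real zero-knowledge proof: the site publishes a policy, the client unlocks a credential with a PIN and discloses only the attribute the policy asks for, and the site checks the disclosed value. In a real system the disclosure would be accompanied by a cryptographic proof that the issuer vouched for the attribute; that part is elided here, and all names and values are invented:

    POLICY = {"attribute": "current_student", "required_value": True}

    CREDENTIAL = {                      # issued by the student union beforehand
        "name": "Alice Example",        # never disclosed below
        "student_number": "s1234567",   # never disclosed below
        "current_student": True,
    }

    def unlock_and_prove(credential, policy, pin):
        """Client side: unlock the credential and disclose only what the
        policy asks for. A real system returns a zero-knowledge proof of
        issuance rather than the bare attribute value."""
        if pin != "1234":               # placeholder for real credential storage
            raise PermissionError("wrong PIN")
        attr = policy["attribute"]
        return {attr: credential[attr]}

    def verify(disclosure, policy):
        """Website side: check the disclosed attribute against the policy."""
        return disclosure.get(policy["attribute"]) == policy["required_value"]

    disclosure = unlock_and_prove(CREDENTIAL, POLICY, pin="1234")
    print(disclosure)                   # {'current_student': True} -- nothing else
    print(verify(disclosure, POLICY))   # True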
http://www.w3.org/QA/2010/11/boosting_privacy_online_-_anon.html
See also PRIME (Privacy and Identity Management for Europe): https://www.prime-project.eu/
"Over the past 30 years, the Internet has proven to be a revolutionary technology, so much so that it has run far beyond its original scope.
The original Internet was designed as a network to connect academic researchers. From these humble beginnings, it grew into a global communications network as integral to our lives as roads, telephones, and public utilities. Increasingly, it's taking on roles formerly performed by other infrastructures, such as mail (email), phones (voice over IP), television, and movies (streaming video). Simply put, it's the global network of the 21st century... In this article, we briefly review US and European approaches to addressing the future Internet's core infrastructure problems.
In Europe, technology companies, government, and academics have allied together to build the infrastructure for a new service economy that ties all facets of our personal and professional lives into one end-to-end system. The general European consensus is that the future Internet must meet a wide array of challenges and opportunities.... Europe regards the future Internet as a matter of public policy; consequently, government, academia, and businesses have banded together in a collaborative research agenda. The most prominent publicly funded US research efforts tend to focus on networking infrastructure and architecture. The notion of a clean-slate approach (redesigning the Internet from scratch) animates much of the research in the US... According to David Clark, a senior research scientist at the MIT Computer Science and Artificial Intelligence Laboratory, US government funding agencies tend to support research into long-term questions and put less emphasis on commercial applications.
In Europe, funding is often associated with ministries of trade or commerce and thus seeks commercial relevance. In the US, government- funded initiatives focus on architectural questions, and there's less talk of a holistic model of the future Internet. Europeans tend to consider more regulatory questions up front. According to Clark, US researchers tend to leave short-term commercial applications to the private sector (indeed, some of the biggest US Internet successes such as Amazon, Facebook, and Google were the result of smart entrepreneurs) and let the government step in later to clean up the mess...
http://www.computer.org/cms/Computer.org/ComputingNow/homepage/2010/1110/W_IC_IsEuropeLeading.pdf
See also IEEE Internet Computing: http://www.computer.org/portal/web/computingnow/internetcomputing
According to Avram, "54% of the video published on the Internet is currently available in the HTML5 format, according to MeFeedia, and new HTML5 editing tools are announced by Adobe and Sencha, showing that HTML5 is taking off.
MeFeedia, an online video portal, has conducted a study to find out how much HTML5 content is out there. Drawing on a video index with millions of entries covering over 33,000 video publishers, the study concluded that online video content available as HTML5 has doubled over the past 5 months from 26% to 54%, and it has grown 5 times since the beginning of the year, when it was only 10%...
At the same time, new visual HTML5 editors are being announced. One of them is Edge from Adobe, a prototype tool aimed at Photoshop, Illustrator and Flash Pro users for creating HTML5 animations...
Another tool recently announced is Sencha Animator, a GUI-based editor for interactive designers interested in creating HTML5 animations.
Animator was created with Ext JS, a cross-browser JavaScript library providing widgets for RIA applications, and it generates pure CSS3 animation code working with any JS library..."
http://www.infoq.com/news/2010/10/HTML5-Is-Taking-Off
See also the HTML5 Draft: http://dev.w3.org/html5/spec/Overview.html
"Glenn Block, a Windows Communication Foundation (WCF) Program Manager, reported during an online webinar 'WCF, Evolving for the Web' that Microsoft's framework for building service-oriented applications is going to be refactored radically, the new architecture being centered around HTTP. Block started the online session by summarizing the current trends in the industry: a move to cloud-based computing; a migration away from SOAP; a shift towards browsers running on all sorts of devices; an increase in the adoption of REST; emerging standards like OAuth, WebSockets...
One of the key features of WCF is support for multiple transports (HTTP, TCP, named pipes) under the same programming model. Unfortunately, when it comes to HTTP, a lot of HTTP goodness (scale, content negotiation) is lost because WCF treats it as a transport. So Block is looking forward to seeing WCF support HTTP as a first-class application protocol with a simple and flexible programming model...
WCF will contain helper APIs for pre-processing HTTP requests or responses, doing all the parsing and manipulation of arguments and encapsulating the HTTP information in objects that can later be handed off for further processing. This will relieve users of having to deal with HTTP internals directly unless they want to. This feature will also provide a plug-in capability for media-type formatters of data formats like JSON, Atom, OData, etc. WCF will support some of them out of the box, but users will be able to add their own formatters.
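The media-type formatter plug-ins described here are .NET APIs still under development; the sketch below only illustrates the underlying idea of content negotiation in plain Python, picking a formatter for the response body from the request's Accept header (the names and payload are invented):

    import json
    import xml.etree.ElementTree as ET

    def to_json(obj):
        return json.dumps(obj)

    def to_xml(obj):
        root = ET.Element("order")
        for key, value in obj.items():
            ET.SubElement(root, key).text = str(value)
        return ET.tostring(root, encoding="unicode")

    FORMATTERS = {                     # media type -> formatter (the "plug-in" table)
        "application/json": to_json,
        "application/xml": to_xml,
    }

    def negotiate(accept_header, payload):
        """Pick the first acceptable media type we have a formatter for."""
        for entry in accept_header.split(","):
            media_type = entry.split(";")[0].strip()
            if media_type in FORMATTERS:
                return media_type, FORMATTERS[media_type](payload)
        return "application/json", to_json(payload)   # default representation

    print(negotiate("application/xml,*/*;q=0.8", {"id": 42, "status": "shipped"}))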
We asked Glenn Block what is going to happen to the other protocols, especially SOAP. His answer was that WCF is going to fully support the existing stack, and that the current development is meant to evolve WCF to fully support HTTP without giving up anything WCF offers today..."
http://www.infoq.com/news/2010/10/WCF-REST
See also on Windows Communication Foundation: http://msdn.microsoft.com/en-us/netframework/aa663324.aspx
"Programmers and academics often think and theorize about XML as kind of tree data structure. And so indeed it is. But it is also allows much more: it is a series of different graph structures composed into or imposed on top of that tree... Many people only use the tree structures in XML by choice (Content, Attributes, Elements and
Comments: CACE?), and then find themselves having to revisit and perhaps reinvent the same kinds of data structures provided by the bits of XML they don't use. This is neither good nor bad: it depends on the case. I am not saying that XML IDs or entities are perfect or imperfect, for example: merely that they provide since they spring out of solving particular problems, we can see them as one way of revealing a general problem or solution space...
So [I provide here] a table showing what I think are four different simultaneous data structures that are available in vanilla XML. In this view, XML is: (1) The elements in XML form a tree (a single-rooted, directed, acyclic graph with no shared nodes). And a particular kind of tree: an ordered, typeable tree with labelled nodes that can have properties (a kind of attribute-value tree?) and unique identifiers, and with unlabelled text with no properties possible as leaves, but whose edges are unlabelled and have no properties. (2) Imposed on this we have a graph structure made using the ID/IDREF links. (3) Underneath the elements, the document is composed as a structure of parsed entities.
Most XML documents are made of a single entity: one file or one web resource of course, but the entity mechanism allows a document to be constructed from multiple sources of text. These parsed entities form an acyclic directed ordered graph. (4) Then, above this are links that point outside the document (again using the entity mechanism): non-XML entities such as graphics form a star, XML documents form a graph (e.g., the Web), and I've tacked in for completeness the old SGML SUBDOC feature which allows documents to be nested...
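Two of those layers, the element tree and the ID/IDREF graph imposed on it, can be shown with a few lines of Python. The document is invented, and since plain ElementTree does not read a DTD, the 'id' and 'ref' attributes are treated as ID/IDREF by convention only:

    import xml.etree.ElementTree as ET

    doc = ET.fromstring("""
    <report>
      <section id="intro"><title>Introduction</title></section>
      <section id="results">
        <title>Results</title>
        <xref ref="intro"/>   <!-- a cross-reference: graph edge, not tree edge -->
      </section>
    </report>
    """)

    # Layer 1: the element tree (single root, ordered, acyclic).
    for section in doc.iter("section"):
        print("tree node:", section.get("id"))

    # Layer 2: the ID/IDREF graph overlaid on the tree.
    by_id = {el.get("id"): el for el in doc.iter() if el.get("id")}
    for xref in doc.iter("xref"):
        target = by_id[xref.get("ref")]
        print("graph edge:", xref.get("ref"), "->", target.find("title").text)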
Most of the built-in XML layers have more ambitious re-workings: instead of XML parseable entities you may be able to use XInclude, instead of ID/IDREF you may find XSD's KEY/KEYREF better (but probably not), instead of external entities you may find XLinks (for navigation) or RDF (for semantic links) better. No-one would (or could) use SUBDOC but would go for XML Namespaces plus XInclude...
The thought comes as to whether some of the additional XML standards are attempts to convert more of the Ns into Ys, or at least whether that would be a more rational approach to improving XML's (or its successor's) expressive power while keeping things simple. For example, XML Schemas has its own composition mechanism (xs:include) and its own hierarchy mechanism (type derivation), its own internal linking system (global/local types) and an external linking system (import). But in the composition layer, how to handle circular inclusions or multiple inclusions (graphs or acyclic graphs) necessarily has to be defined..."
http://broadcast.oreilly.com/2010/10/under-estimating-xml-as-just-a.html
From Google: "Most of the common image formats on the web today were established over a decade ago and are based on technology from around that time. Some engineers at Google decided to figure out if there was a way to further compress lossy images like JPEG to make them load faster, while still preserving quality and resolution. As part of this effort, we are releasing a developer preview of a new image format, WebP, that promises to significantly reduce the byte size of photos on the web, allowing web sites to load faster than before.
Images and photos make up about 65% of the bytes transmitted per web page today. They can significantly slow down a user's web experience, especially on bandwidth-constrained networks such as a mobile network.
Images on the web consist primarily of lossy formats such as JPEG, and to a lesser extent lossless formats such as PNG and GIF. Our team focused on improving compression of the lossy images, which constitute the larger percentage of images on the web today.
To improve on the compression that JPEG provides, we used an image compressor based on the VP8 codec that Google open-sourced in May 2010.
We applied the techniques from VP8 video intra frame coding to push the envelope in still image coding. We also adapted a very lightweight container based on RIFF. While this container format contributes a minimal overhead of only 20 bytes per image, it is extensible to allow authors to save meta-data they would like to store.
While the benefits of a VP8 based image format were clear in theory, we needed to test them in the real world. In order to gauge the effectiveness of our efforts, we randomly picked about 1,000,000 images from the web (mostly JPEGs and some PNGs and GIFs) and re-encoded them to WebP without perceptibly compromising visual quality. This resulted in an average 39% reduction in file size. We expect that developers will achieve in practice even better file size reduction with WebP when starting from an uncompressed image..."
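Developers who want to run a similar, much smaller experiment on their own images can do so with the third-party Pillow library, whose WebP support depends on how it was built. This is not Google's conversion tooling, just an illustrative re-encoding and file-size comparison with placeholder file names:

    import os
    from PIL import Image                        # third-party Pillow

    src, dst = "photo.jpg", "photo.webp"         # placeholder file names
    Image.open(src).save(dst, "WEBP", quality=80)   # lossy WebP at quality 80

    before, after = os.path.getsize(src), os.path.getsize(dst)
    print("%d -> %d bytes (%.1f%% smaller)"
          % (before, after, 100.0 * (before - after) / before))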
http://blog.chromium.org/2010/09/webp-new-image-format-for-web.html
See also the WebP Google Code site: http://code.google.com/speed/webp/
Members of the W3C RDFa Working Group have released a First Public Working Draft for the specification "RDFa API: An API for Extracting Structured Data from Web Documents." RDFa, as defined in "RDFa Core
1.1: Syntax and Processing Rules for Embedding RDF Through Attributes", specifies the use of attributes "to express structured data in any markup language. The embedded data already available in the markup language (e.g., XHTML) is reused by the RDFa markup, so that publishers don't need to repeat significant data in the document content.
RDFa enables authors to publish structured information that is both
human- and machine-readable. Concepts that have traditionally been difficult for machines to detect, like people, places, events, music, movies, and recipes, are now easily marked up in Web documents. While publishing this data is vital to the growth of Linked Data, using the information to improve the collective utility of the Web for humankind is the true goal.
To accomplish this goal, it must be simple for Web developers to extract and utilize structured information from a Web document. This RDFa API document details such a mechanism -- an RDFa Application Programming Interface (RDFa API) that allows simple extraction and usage of structured information from a Web document... A document that contains RDFa effectively provides two data layers. The first layer is the information about the document itself, such as the relationship between the elements, the value of its attributes, the origin of the document, and so on, and this information is usually provided by the Document Object Model, or DOM. The second data layer comprises information provided by embedded metadata, such as company names, film titles, ratings, and so on, and this is usually provided by RDFa, Microformats, DC-HTML, GRDDL, or Microdata..."
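The RDFa API itself is a JavaScript/DOM interface, so the sketch below is only a rough Python illustration of that second data layer: walking the markup and collecting statements expressed through RDFa-style attributes. The sample markup is invented and the processing is drastically simplified; a real RDFa processor also handles prefix mappings, chaining, datatypes and much more:

    from html.parser import HTMLParser

    SAMPLE = """
    <div about="http://example.org/film/blade-runner">
      <span property="dc:title">Blade Runner</span>
      <span property="dc:date" content="1982"></span>
    </div>
    """

    class RdfaSketch(HTMLParser):
        def __init__(self):
            super().__init__()
            self.subject = None
            self.pending_property = None
            self.triples = []

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if "about" in attrs:                     # sets the current subject
                self.subject = attrs["about"]
            if "property" in attrs:
                if "content" in attrs:               # literal given as an attribute
                    self.triples.append(
                        (self.subject, attrs["property"], attrs["content"]))
                else:                                # literal is the element text
                    self.pending_property = attrs["property"]

        def handle_data(self, data):
            if self.pending_property and data.strip():
                self.triples.append(
                    (self.subject, self.pending_property, data.strip()))
                self.pending_property = None

    parser = RdfaSketch()
    parser.feed(SAMPLE)
    print(parser.triples)       # the embedded statements, ready for reuse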
The mission of the RDFa Working Group, part of the Semantic Web Activity is to "support the developing use of RDFa for embedding structured data in Web documents in general. The Working Group will publish W3C Recommendations to extend and enhance the currently published RDFa 1.0 documents, including an API. The Working Group will also support the HTML Working Group in its work on incorporating RDFa in HTML5 and XHTML5."
http://www.w3.org/TR/2010/WD-rdfa-api-20100923/
See also the W3C RDFa Working Group: http://www.w3.org/2010/02/rdfa/
The Internet Engineering Task Force (IETF) has published an initial working draft of the informational document "Internet Media Types and the Web." Initial discussion of some issues has taken place within the W3C Technical Architecture Group (TAG) on a related discussion list.
From the Abstract: "This document describes some of the ways in which parts of the MIME system, originally designed for electronic mail, have been used in the web, and some of the ways in which those uses have resulted in difficulties. This informational document is intended as background and justification for a companion Best Current Practice which makes some changes to the registry of Internet Media Types and other specifications and practices, in order to facilitate Web application design and standardization.
The goal of the document is to prompt an evolution within W3C and IETF over the use of MIME (and in particular Internet Media Types) to fix some of the outstanding problems. This is an initial version for review and update. The goal is to first survey the current situation and then make a set of recommendations on the definition and use of MIME components (and specifically, Internet Media Types and charset
declarations) to facilitate their standardization across Web and Web-related technologies with other Internet applications.
MIME was invented originally for email, based on general principles of 'messaging', a foundational architecture framework. The role of MIME was to extend Internet email messaging from ASCII-only plain text to include other character sets, images, rich documents, etc. The basic architecture of complex content messaging is: (1) Message sent from A to B. (2) Message includes some data. Sender A includes standard 'headers' telling recipient B enough information that recipient B knows how sender A intends the message to be interpreted. (3) Recipient B gets the message, interprets the headers for the data and uses it as information on how to interpret the data... [But now] The 'Internet Media Type registry' (MIME type registry) is where someone can tell the world what a particular label means, as far as the sender's intent of how recipients should process a message of that type, and the description of a recipient's capabilities for senders... The differences between the use of Internet Media Types between email and HTTP were minor (default charset; requirement for CRLF in plain text) but these minor differences have caused a lot of trouble...
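That header mechanism is easy to see with the Python standard library's email package, MIME's original home; HTTP later reused the same Content-Type machinery:

    from email.message import EmailMessage
    from email import message_from_bytes

    # Sender A labels the payload so recipient B knows how to interpret it.
    msg = EmailMessage()
    msg["From"], msg["To"], msg["Subject"] = "a@example.org", "b@example.org", "demo"
    msg.set_content("<p>Caf\u00e9 menu</p>", subtype="html", charset="utf-8")

    wire = msg.as_bytes()                    # what actually travels from A to B

    # Recipient B reads the headers before touching the body.
    received = message_from_bytes(wire)
    print(received.get_content_type())       # text/html
    print(received.get_content_charset())    # utf-8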
Additional considerations treated in the presentation: There are related problems with charsets; Embedded, downloaded, launch independent application; Additional Use Cases: Polyglot and Multiview; Evolution, Versioning, Forking; Content Negotiation; Fragment identifiers..."
http://xml.coverpages.org/draft-masinter-mime-web-info-00.txt
See also MIME and the Web directions: http://lists.w3.org/Archives/Public/www-tag/2010Sep/0027.html
Almost one year ago, I wrote about browser testing. The idea started with the fact that W3C has a number of Working Groups who are trying to review the way they do testing, but also increase the number of tests they are doing as well. Since then, the situation improved a bit but we're still far from reaching an appropriate comfort level.
There is now a new Mercurial (version control) server where you will find tests related to HTML5 and Web Applications. All of those tests are automatically mirrored on the site 'test.w3.org'. The Mobile tests are also linked from there. The CSS and MathML test suites are still in a separate space. The SVG Working Group and the Internationalization effort are in a transition phase.
A few of our Groups have documented how to contribute tests: the CSS Working Group, the HTML Working Group, and the SVG Working Group.
The Internationalization effort has been reporting on test results for a while, such as for language declarations. The HTML Working Group, working on HTML5, also started to publish ongoing test results. Thanks to a few contributions, we started to test a few features, including video and canvas...
The HTML test suite only contains 97 approved tests for the moment, so don't draw too many conclusions from the results table. The number of tests needs to increase significantly if we want to test HTML5 properly. Around 900 tests are waiting to be approved within the task force, but we're lacking participants. Help in identifying more test sets and submitting them to the group would also be appreciated. The Web browsers of tomorrow are being developed and tested today, so don't wait: help us make the Web a better place! Pick your favorite HTML, CSS, SVG, MathML, or API feature, write as many tests as you can for it, and submit those tests to us. Doing that before sending bug reports to various browser vendors is your best chance to get your favorite feature properly implemented, and the browser developers will even thank you..."
http://www.w3.org/QA/2010/09/how_do_we_test_a_web_browser_o.html
See also the example HTML5 conformance tests: http://test.w3.org/html/tests/reporting/report.htm
Members of the W3C Voice Browser Working Group have published an updated Working Draft for "Voice Extensible Markup Language (VoiceXML) Version 3.0." VoiceXML is used to create interactive media dialogs that feature synthesized speech, recognition of spoken and DTMF key input, telephony, mixed-initiative conversations, and recording and presentation of a variety of media formats including digitized audio and digitized video. In this Working Draft a 'Revised Legacy' profile description is provided to match current WG thinking, Section 5.4 'SIV Resource' is removed (since SIV is now covered along with the recognition resource), and the Event Model of Section 4.4 has been revised to match the WG members' current thinking about DOM events as the underlying model for all flow control. Open issues are highlighted in the diff-marked version of the specification.
VoiceXML 3.0 "explains the core of VoiceXML 3.0 as an extensible framework -- how semantics are defined, how syntax is defined and how the two are connected together. In this document, the "semantics" are the definitions of core functionality, such as might be used by an implementer of VoiceXML 3.0. The definitions are represented as English text, SCXML syntax, and/or state chart diagrams. The term "syntax" refers to XML elements and attributes that are an application author's programming interface to the functionality defined by the "semantics".
Within the Core document, all the functionality of VoiceXML 3.0 is grouped into modules of related capabilities. Modules can be combined together to create complete profiles (languages). This document describes how to define both modules and profiles. In addition to describing the general framework, this document explicitly defines a broad range of functionality, several modules and two profiles...
In Version 3.0, the Voice Browser Working Group has developed the detailed semantic descriptions of VoiceXML functionality that versions 2.0 and 2.1 lacked. The semantic descriptions clarify the meaning of the VoiceXML 2.0 and 2.1 functionalities and how they relate to each other. Detailed semantics for new functionality are now defined: new functions include, for example, speaker identification and verification, video capture and replay, and a more powerful prompt queue. These semantic descriptions for these new functions are also represented in this document as English text, UML state chart visual diagrams and/or textual SCXML representations... Organization of functionality into modules makes it easier to understand what happens when modules are combined or new ones are defined. In contrast, VoiceXML 2.0 and 2.1 had a single global semantic definition (the FIA), which made it difficult to understand what would happen if certain elements were removed from the language or if new ones were added..."
http://www.w3.org/TR/2010/WD-voicexml30-20100831/
See also the W3C Voice Browser Activity FAQ document: http://www.w3.org/Voice/#faq
"Context, which is the environment or situation surrounding a particular target, is a critical component of federal data architectures that needs to be planned and implemented before an incident occurs in which it is needed. This article examines three information management projects in which context plays a key role in the solution...
Given that modern development platforms can automatically generate code to process XML documents, a narrow perspective can affect the exchange and any code that processes that exchange. The new approach being spearheaded by forward-thinking elements of the Army and Air Force is to create the semantics first, via a high-fidelity data model called an ontology, and then generate the XML schemas from that model.
Although not based on the Web Ontology Language, the National Information Exchange Model (NIEM) takes a similar approach, in which the XML schemas are generated from a database-backed data model. The contextual nature of this approach is that the ontology uses a more top-down, enterprise perspective to guide the inclusion of bottom-up exchanges. The heightened awareness and use of context were mirrored on the commercial front by Google's purchase of Metaweb and the company's Freebase entity graph.
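A toy sketch of the 'model first, schemas generated' direction described above: the hand-written dictionary below stands in for an ontology or NIEM-style data model (real pipelines are far richer), and a skeletal XSD is derived from it rather than authored by hand. All names are invented:

    import xml.etree.ElementTree as ET

    MODEL = {   # entity -> properties with simple types (the "semantics first")
        "Person": {"fullName": "xs:string", "birthDate": "xs:date"},
        "Incident": {"reportedAt": "xs:dateTime", "summary": "xs:string"},
    }

    XS = "http://www.w3.org/2001/XMLSchema"
    ET.register_namespace("xs", XS)
    schema = ET.Element(f"{{{XS}}}schema")

    # Generate one global element with a complex type per model entity.
    for entity, props in MODEL.items():
        el = ET.SubElement(schema, f"{{{XS}}}element", name=entity)
        ct = ET.SubElement(el, f"{{{XS}}}complexType")
        seq = ET.SubElement(ct, f"{{{XS}}}sequence")
        for prop, xsd_type in props.items():
            ET.SubElement(seq, f"{{{XS}}}element", name=prop, type=xsd_type)

    print(ET.tostring(schema, encoding="unicode"))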
The elevation of context in our information management activities is a sign of a more aggressive attitude toward actively managing our data so that we can take advantage of its potential. The key to mastering context is to understand the role of metadata in your organization and how to effectively design it... metadata captures context, whereas your data is content..."
http://gcn.com/articles/2010/08/16/daconta-context-in-info-sharing.aspx
See also the National Information Exchange Model: http://www.niem.gov/
"Open Text Corporation, a preeminent provider of enterprise content management (ECM) software, has announced Open Text Semantic Navigation as an innovative tool that helps audiences naturally navigate through volumes of information based on the inherent meaning of the content and increasing Web marketing and online search effectiveness. Available as a cloud-offering or on-premise, Semantic Navigation gives organizations an easy way to improve engagement with online audiences.
At the core of the offering is the Open Text Content Analytics engine that intelligently extracts meaning, sentiment and context from content, and in turn marries that content to what a customer or prospect is looking for on a website. The result is that audiences more consistently and quickly find helpful, valuable information with much less effort.
A complete solution, Open Text Semantic Navigation is designed to complement any existing Web site, independent of the Web content management system used, either installed on local servers or as an online service provided by Open Text. With the cloud-based offering (currently in beta), organizations can rapidly and inexpensively upgrade their sites' user experience using a free, fully functional 30-day trial.
Once the trial is activated, Semantic Navigation first collects content through a crawling process. Then the content is automatically analyzed and tagged with relevant and insightful entities, topics, summaries and sentiments -- the key to providing an engaging online experience.
Next, content is served to users through intuitive navigation widgets that encourage audiences to discover the depth of available information or share it on social networks, such as Facebook and Twitter. From there, Semantic Navigation supports placement of product and service offerings or advertising to convert page views into sales...."
http://www.opentext.com/2/global/press-release-details.html?id=2395
See also the Open Text Semantic Navigation Cloud: http://www.semanticnavigation.opentext.com/
An updated XML schema has been published by developers in the Music Encoding Initiative (MEI) project. The Music Encoding Initiative (MEI) schema "is a set of rules for recording the intellectual and physical characteristics of music notation documents so that the information contained in them may be searched, retrieved, displayed, and exchanged in a predictable and platform-independent manner. The schema is provided in both RelaxNG (RNG) and W3C (XSD) schema forms. Both versions consist of a driver file (mei-all), modules (such as analysis, cmnOrnaments,
etc.) and auxiliary files (defaultClassDecls and datatypes)...
Music Encoding Initiative strives to create a semantically rich model for music notation that: (1) accommodates the encoding of common Western music, but is not limited to common music notation; (2) is designed by the scholarly community for scholarly uses, but does not exclude other uses; (3) provides for the common functions of traditional facsimile, critical, and performance editions; (4) has a modular structure that permits use dependent on the goals of scholars; and (5) is based on open standards and is platform-independent; (6) employs XML technologies;
(7) permits the development of comprehensive and permanent international archives of notated music as a basis for editions, analysis, performances, and other forms of research...
As a natural-language translation of the MEI schema, the tag library conveys information about the three principal tasks accomplished by the schema. First, the schema breaks down the content of music notation documents into data fields or categories of information called 'elements'. All of these elements are named, defined, and described in the MEI Tag Library. Second, the tag library identifies and defines attributes associated with those elements. Attributes are characteristics or properties that further refine the element. Last, and perhaps most importantly, the tag library expresses the schema structure by explaining the relationship between elements, specifying where the elements may be used and describing how they may be modified by attributes...
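One practical way to put the published schema to work is to validate an encoding against the RelaxNG form, for example with the third-party lxml library; the file names below are placeholders for the mei-all driver file and a local MEI document:

    from lxml import etree

    schema = etree.RelaxNG(etree.parse("mei-all.rng"))   # the RNG driver file
    doc = etree.parse("my-encoding.xml")                 # an MEI-encoded score

    if schema.validate(doc):
        print("document is valid MEI")
    else:
        for error in schema.error_log:
            print(error.line, error.message)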
Contributors are involved in several MEI projects. The MEI editor currently in development aims to enable users to view and graphically edit MEI-encoded music documents in CWN (Common Western Notation). The supported function set is only a subset of the features contained in MEI, excluding more exotic features such as medieval neumes and features that are graphically complex to implement... One of the editors of the new edition of Haydn's complete works reports on the possibilities that MEI offers for scholarly music editions, using partial encodings of two aria arrangements made by Joseph Haydn for the Esterhazy court..."
http://music-encoding.org/
See also earlier references for XML and Music: http://xml.coverpages.org/xmlMusic.html
"The challenges of choosing a Web browser are greater now because the browser is becoming the home for almost everything we do. Do you have documents to edit? There's a website for that. Did you miss a television show? There's a website for that. Do you want to announce your engagement?
There's a website for that too. The Web browser handles all of that and more... On one hand, the programs are as close to commodities as there are in the computer industry. The core standards are pretty solid and the job of rendering the document is well understood. Most differences can be smoothed over when the Web designers use cross-platform libraries like jQuery...
It's easy for a programmer to be enthusiastic about Google's Chrome 5.0 because Google has been emphasizing some of the things that programmers love. Chrome sticks each Web page in a completely separate process, which you can see by opening up Windows Task Manager. If some Web programmer creates an infinite loop or a bad AJAX call in a Web page, Chrome isolates the trouble. Your other pages can keep on running... Best for: People who want to juggle many windows filled with code that crashes every so often. Worst for: People who get upset when a website breaks because the developer tested the site on IE only...
The old Netscape died years ago, but somehow it begat the Firefox browser that gave us many of the innovations now being copied by IE 9 and others...
One of the strengths of Firefox continues to be the large collection of extensions and plug-ins. These can all be written in a mixture of JavaScript, CSS, and HTML, something that makes them a bit easier for the average Web developer to tackle. In contrast, Microsoft's add-ons can be written in C++. Firefox add-ons like Greasemonkey make it even easier to write simple scripts that meddle with the DOM of incoming data, a nice playpen for creating your own quick add-ons... Best for: People who enjoy the wide-open collection of extensions. Worst for: People who write long-running scientific simulations in JavaScript...
Microsoft Internet Explorer 9.0 beta: continues to dominate, thanks to the fact that it may or may not be integrated with the Windows operating system, depending upon the political winds. Microsoft noticed the erosion from total world domination several years ago and is now rapidly adopting some of the best features from the alternatives... IE9 now offers many of the features that drew me to other browsers. There's a nicer developer tool for debugging JavaScript, and the speed is catching up to the others... Best for: People who don't care or don't want to care.
IE is still the most likely to work with most websites. Worst for:
People who worry about browser-based attacks and those who want to try the latest HTML5.... [As for Opera 10.60] - Best for: Raw speed and innovation. Worst for: People who can't imagine straying from the pack...
[As for Apple Safari 5.0] - Best for: Web developers who want to support WebKit phones. Worst for: Lovers of extensions and add-ons..."
http://www.infoworld.com/d/applications/the-best-web-browser-chrome-firefox-internet-explorer-opera-or-safari-516
"At the just-concluded Balisage conference, Michael Sperberg-McQueen brought up the (apparently) famous 'worse is better' essay by Richard P. Gabriel... Gabriel's original argument is essentially that software that chooses simplicity over correctness and completeness has better survivability for a number of reasons, and cites as a prime example Unix and C, which spread precisely because they were simple (and thus easy to port) in spite of being neither complete functionally nor consistent in terms of their interfaces (user or programming). Gabriel then goes on, over the years, to argue against his own original assertion that worse is better and essentially falls into a state of oscillation between 'yes it is' and 'no it isn't'...
Thinking then about 'worse is better' and Gabriel's inability to decide conclusively if it is actually better got me to thinking and the conclusion I came to is that the reason Gabriel can't decide is because both sides of his dichotomy are in fact wrong. In the New Jersey approach, 'finished' is defined by the implementors with no obvious reference to any objective test of whether they are in fact finished.
At the same time, the MIT approach falls into the trap that agile methods are designed explicitly to avoid, namely overplanning and implementation of features that may never be used...
Both the MIT and New Jersey approaches ultimately fail because they are not directly requirements driven in the way that agile methods are and must be. Or put another way, the MIT approach reflects the failure of overplanning and the New Jersey approach reflects the failure of underplanning. Agile methods, as typified by Extreme Programming, attempt to solve the problem by doing just the right amount of planning, and no more, and that planning is primarily a function of requirements gathering and validation in the support of iteration.
To that degree, agile engineering is much closer to the worse is better approach, in that it necessarily prefers simplicity over completeness and it tends, by its start-small-and-iterate approach, to produce smaller solutions faster than a planning-heavy approach will..."
http://drmacros-xml-rants.blogspot.com/2010/08/worse-is-better-or-is-it.html
See also the early 1991 paper: http://dreamsongs.com/WIB.html
This presentation by Eric Sachs (Senior Product Manager, Google Security) was given at the 2010 Cloud Identity Summit. The summit included 'Dissecting Cloud Identity Standards', where "Secure internet identity infrastructure requires standard protocols, interfaces and APIs. The summit goals were to help make sense of the alphabet soup presented to end-users, including OpenID, SAML, SPML, XACML, OIDF, ICF, OIX, OSIS, OAuth (IETF), OAuth WRAP, SSTC, WS-Federation, WS-SX (WS-Trust), IMI, Kantara, Concordia, Identity in the Clouds (a new OASIS TC), Shibboleth, Cloud Security Alliance and TV Everywhere...
Sachs' paper overviews Google's goals in identity services: to increase growth and provide a more seamless user experience. Google provides federated identity services for over 2 million businesses and hundreds of millions of users. He explains why Google has made such a large investment in technologies such as OpenID and OAuth, and how consumer websites and enterprise-oriented websites are connecting... [Excerpts:] Broad Net-wide goals are to (1) Reduce friction on the Internet by:
improving collaboration between users, especially between companies; promoting data sharing between users and their service providers; and enhancing user experience through personalization and increased signup rates; (2) Increase user confidence in the security of the Internet, by reducing password proliferation and re-use across sites; promoting high adoption of multi-factor authentication; and advancing user/enterprise-controlled data-sharing...
As to eliminating passwords by using Open Standards: No one company can do this on their own. Consistency in User Interface/Experience is critical. Support from major players is a must (Microsoft, Facebook, Google, Yahoo, AOL, etc.). The solution must support not just consumers, but also small/medium-sized businesses and enterprises, and the solution must work globally. It's not just web apps: the solution must also support iPhone apps, POP/IMAP apps, Windows apps, Mac apps, Linux apps, BlackBerry apps, etc. If the app's website has no password for the user, what does the user type in the login box? It's the same problem as with OpenID and SAML. On a web login page, we redirect via SAML/OpenID; what do you do from a login page that is not in a web browser?
Multi-factor authentication unlocks a market for multi-factor auth vendors, especially mobile phone/network providers; usability is greatly improved by linking a user/employee's single identity provider with multi-factor authentication..."
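The redirect-based web login that Sachs contrasts with password boxes can be sketched in generic OAuth 2.0 style. The endpoint, client identifier and scope below are placeholders, not Google's actual configuration; the sketch only shows how the first redirect leg is constructed:

    import secrets
    from urllib.parse import urlencode

    # Placeholder: a real deployment uses the identity provider's documented URL.
    AUTHORIZATION_ENDPOINT = "https://idp.example.com/oauth2/authorize"

    def build_login_redirect(client_id, redirect_uri, scope):
        state = secrets.token_urlsafe(16)     # anti-CSRF value, checked on return
        params = {
            "response_type": "code",
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "scope": scope,
            "state": state,
        }
        return AUTHORIZATION_ENDPOINT + "?" + urlencode(params), state

    url, state = build_login_redirect("my-app",
                                      "https://myapp.example.com/callback",
                                      "openid email")
    print(url)   # send the browser here instead of showing a password box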
http://www.cloudidentitysummit.com/upload/PingKeynote_Eric-sachs.pdf
See also the online Summit presentations: http://www.cloudidentitysummit.com/Presentations-2010.cfm
Members of the W3C Multimodal Interaction Working Group have published a second public working draft for "Emotion Markup Language (EmotionML) Version 1.0." Abstract: "As the web is becoming ubiquitous, interactive, and multimodal, technology needs to deal increasingly with human factors, including emotions. The present draft specification of Emotion Markup Language 1.0 aims to strike a balance between practical applicability and scientific well-foundedness. The language is conceived as a 'plug-in'
language suitable for use in three different areas: (1) manual annotation of data; (2) automatic recognition of emotion-related states from user behavior; and (3) generation of emotion-related system behavior."
As for any standard format, the first and main goal of an EmotionML is
twofold: to allow a technological component to represent and process data, and to enable interoperability between different technological components processing the data.
Use cases for EmotionML can be grouped into three broad types: [A] Manual annotation of material involving emotionality, such as annotation of videos, of speech recordings, of faces, of texts, etc; [B] Automatic recognition of emotions from sensors, including physiological sensors, speech recordings, facial expressions, etc., as well as from multi-modal combinations of sensors; [C] Generation of emotion-related system responses, which may involve reasoning about the emotional implications of events, emotional prosody in synthetic speech, facial expressions and gestures of embodied agents or robots, the choice of music and colors of lighting in a room, etc. Interactive systems are likely to involve both analysis and generation of emotion-related behavior; furthermore, systems are likely to benefit from data that was manually annotated, be it as training data or for rule-based modelling. Therefore, it is desirable to propose a single EmotionML that can be used in all three contexts.
Concrete examples of existing technology that could apply EmotionML
include: (1) Opinion mining / sentiment analysis in Web 2.0, to automatically track customers' attitudes regarding a product across blogs;
(2) Affective monitoring, such as ambient assisted living applications for the elderly, fear detection for surveillance purposes, or using wearable sensors to test customer satisfaction; (3) Character design and control for games and virtual worlds; (4) Social robots, such as guide robots engaging with visitors; (5) Expressive speech synthesis, generating synthetic speech with different emotions, such as happy or sad, friendly or apologetic; (6) Emotion recognition (e.g., for spotting angry customers in speech dialog systems); (7) Support for people with disabilities, such as educational programs for people with autism..."
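For a concrete feel of what such an annotation might look like, the sketch below parses a small EmotionML-style fragment with the Python standard library. The namespace, element and attribute names follow the working draft only loosely and may not match the final specification:

    import xml.etree.ElementTree as ET

    SAMPLE = """
    <emotionml xmlns="http://www.w3.org/2009/10/emotionml">
      <emotion>
        <category name="satisfaction" value="0.8"/>
      </emotion>
    </emotionml>
    """

    NS = {"em": "http://www.w3.org/2009/10/emotionml"}
    root = ET.fromstring(SAMPLE)
    for cat in root.findall(".//em:category", NS):
        print(cat.get("name"), cat.get("value"))   # e.g. satisfaction 0.8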
http://www.w3.org/TR/2010/WD-emotionml-20100729/
See also the W3C Multimodal Interaction Activity: http://www.w3.org/2002/mmi/
The Model-View-ViewModel (MVVM) design pattern describes a popular approach for building WPF and Silverlight applications. It's both a powerful tool for building applications and a common language for discussing application design with developers. While MVVM is a really useful pattern, it's still relatively young and misunderstood. In this article the author explains how the ViewModel works, and discusses some benefits and issues involved in implementing a ViewModel in your code. He also walks through some concrete examples of using ViewModel as a document manager for exposing Model objects in the View layer.
When is the MVVM design pattern applicable, and when is it unnecessary?
How should the application be structured? How much work is the ViewModel layer to write and maintain, and what alternatives exist for reducing the amount of code in the ViewModel layer? How are related properties within the Model handled elegantly? How should you expose collections within the Model to the View? Where should ViewModel objects be instantiated and hooked up to Model objects?
The Model is often the core of the application, and a lot of effort goes into designing it according to object-oriented analysis and design
(OOAD) best practices. For me the Model is the heart of the application, representing the biggest and most important business asset because it captures all the complex business entities, their relationships and their functionality. Sitting atop the Model is the ViewModel. The two primary goals of the ViewModel are to make the Model easily consumable by the WPF/XAML View and to separate and encapsulate the Model from the View. These are excellent goals, although for pragmatic reasons they're sometimes broken...
You build the ViewModel knowing how the user will interact with the application at a high level. However, it's an important part of the MVVM design pattern that the ViewModel knows nothing about the View. This allows the interaction designers and graphics artists to create beautiful, functional UIs on top of the ViewModel while working closely with the developers to design a suitable ViewModel to support their efforts.
In addition, decoupling the View from the ViewModel also allows the ViewModel to be more easily unit tested and reused... The MVVM design pattern is a powerful and useful pattern, but no design pattern can solve every issue; combining the MVVM pattern and goals with other patterns, such as adapters and singletons, while also leveraging new .NET Framework 4 features, such as dynamic dispatch, can help address many common concerns around implementing the MVVM design pattern..."
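The article's examples are WPF/C#, but the core idea -- a ViewModel that wraps the Model, raises change notifications, and knows nothing about the View -- can be sketched in a few lines. The following TypeScript sketch is a generic, hypothetical illustration (the Person/PersonViewModel names are invented here), not the author's code.

```typescript
// Minimal, hypothetical MVVM sketch (not the article's WPF/C# code).
// The Model holds pure business data; the ViewModel adapts it for a View
// and raises change notifications; the View (not shown) would bind to
// the ViewModel and subscribe to those notifications.

class Person {                         // Model: pure business data
  constructor(public firstName: string, public lastName: string) {}
}

type ChangeListener = (property: string) => void;

class PersonViewModel {                // ViewModel: View-friendly facade
  private listeners: ChangeListener[] = [];

  constructor(private model: Person) {}

  get displayName(): string {          // derived, presentation-oriented property
    return `${this.model.lastName}, ${this.model.firstName}`;
  }

  set firstName(value: string) {       // writes go through the ViewModel...
    this.model.firstName = value;
    this.notify("displayName");        // ...which tells the View what changed
  }

  onChange(listener: ChangeListener): void {
    this.listeners.push(listener);
  }

  private notify(property: string): void {
    this.listeners.forEach(l => l(property));
  }
}

// A View layer would do something like:
const vm = new PersonViewModel(new Person("Ada", "Lovelace"));
vm.onChange(p => console.log(`property changed: ${p}`, vm.displayName));
vm.firstName = "Augusta";              // logs: property changed: displayName
```

The point of the indirection is exactly what the article describes: the View binds only to ViewModel members, so designers can restyle the UI freely while the ViewModel remains testable in isolation.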
http://msdn.microsoft.com/en-us/magazine/ff798279.aspx
"W3C has announced the release of Unicorn, a one-stop tool to help people improve the quality of their Web pages. Unicorn "combines a number of popular tools in a single, easy interface, including the Markup validator, CSS validator, mobileOk checker, and Feed validator, which remain available as individual services as well.
What is Markup Validation? Most pages on the World Wide Web are written in computer languages such as XML, HTML, XHTML, etc. that allow Web authors to structure text, add multimedia content, and specify what appearance, or style, the result should have. As for every descriptive markup language, these have their own grammar, vocabulary and syntax, and every document written with these computer languages is supposed to follow these rules. The (X)HTML languages, for all versions up to XHTML 1.1, use machine-readable grammars called DTDs, a mechanism inherited from SGML. However, just as texts in a natural language can include spelling or grammar errors, documents using markup languages may (for various reasons) not follow these rules. The process of verifying whether a document actually follows the rules for the
language(s) it uses is called validation, and the tool used for that is a validator. A document that passes this process with success is called 'valid'.
The basic mechanism of Unicorn is as follows: after receiving a request from a user (via the Unicorn user interface), the Unicorn Framework creates and sends a sequence of observation requests to a number of observers.
The observers (validators, checkers, anything...) perform, and then report their observations (a list of errors, warnings or information) back to the framework in their observation response. The framework gathers and processes all observation responses, and displays the final result for user consumption... It uses Tasks such as "validation", "conformance checking" or "check for broken links". Each task is internally known by the framework as a sequence of observations, with different priorities given to different observations. If a high-priority observation returns one or more errors (e.g., a well-formedness error or invalid markup), lower-priority observations will not be requested, and the user will only be sent the results of the observations already processed...
W3C invites developers to enhance the service by creating new modules and testing them in our online developer space, or installing Unicorn locally. W3C looks forward to code contributions from the community as well as suggestions for new features. For example, users can develop a new 'Observer', where an observer is composed of two parts: a RESTful Web Service capable of sending results in a format that Unicorn can understand (you can download the XML schema file of Unicorn responses from the Unicorn public repository) and a file describing the service's capabilities, called the contract... W3C would like to thank the many people whose work has led up to this first release of Unicorn. This includes developers who started and improved the tool over the past few years, users who have provided feedback, translators who have helped localize the interface with 21 translations so far, and sponsors HP and Mozilla and other individual donors.
http://www.w3.org/News/2010#entry-8862
See also the InfoQueue article by Dave West: http://www.infoq.com/news/2010/07/web-unicorn-validation
"W3C is organizing a 'Workshop: The Multilingual Web - Where Are We?'
to take place 26-27 October 2010 in Madrid, Spain. Workshop participants will survey and introduce currently available best practices and standards that help content creators, localizers, language technology developers, browser makers, and others meet the challenges of the multilingual Web.
The Workshop also provides opportunities for networking that span the various communities involved in enabling the multilingual Web.
Participation is free and open to anyone. However, space is limited and participants must send an expression of interest to the program committee. People wishing to speak should also submit a presentation outline as soon as possible.
This is the first of four Workshops being planned by W3C over the next two years as part of the MultilingualWeb European Project. The first Workshop is hosted by the Universidad Politécnica de Madrid.
The workshop is expected to attract a broad set of stakeholders, including managers and practitioners working in the areas of content development, design, localization, and production management; developers of tools such as translation tools, content management systems, editors, etc; researchers and developers working with language technology and resources; browser implementors; standards and industry body representatives; and many more. The interchange of information and perspectives from this diverse group is expected to provide a more thorough picture of the existing landscape for multilingualism on the Web...."
http://www.w3.org/International/multilingualweb/madrid/cfp
See also the W3C Internationalization (I18n) Activity: http://www.w3.org/International/
Mobile devices and platforms boast more features and functionality with each new release, and often mere months separate significant announcements from the leading mobile vendors. The headlines are mostly about UI features and hardware enhancements such as processor speed and storage capacity. But the crucial fact remains that content is king.
Content (or, more generally, data) is exchanged constantly among applications, servers, mobile devices, and users. Without the ability to work with that data, smartphones such as Apple's iPhone and Google Android devices simply become overpriced and underperforming cell phones.
Consider the phenomenal success of social-networking platforms such as Facebook, LinkedIn, and Twitter. From a pure feature-and-function perspective, these platforms are largely pedestrian. They are popular because members and site visitors derive value from the content published there. And that content is accessed increasingly by mobile devices.
This article demonstrates the use of XML and JSON data-interchange formats on the Android platform. The source of the data for the example application is a status-update feed for a Twitter account. The feed data is available from Twitter in both XML and JSON formats. As you see, the programming approach to manipulating the data varies significantly between the two formats... Compared to the JSON approach, the XML approach is somewhat faster and less memory-constrained -- at the expense of additional complexity. In Part 2, I'll introduce some advanced techniques combining JSON data, WebKit-based WebView widgets, and custom dynamic application logic for Android applications..."
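The article's code is Android/Java; purely to illustrate the format contrast it discusses, here is a hypothetical TypeScript sketch that parses the same kind of status feed from JSON and from XML using standard browser APIs (JSON.parse and DOMParser), not the Android classes the article uses. The feed shape shown is invented for the example.

```typescript
// Hypothetical sketch of the XML-vs-JSON contrast the article discusses,
// using browser APIs rather than the Android parsers the article covers.
// The feed structure below is invented for illustration.

interface Status { user: string; text: string; }

function parseJsonFeed(body: string): Status[] {
  // JSON maps directly onto objects: one parse call, then property access.
  const entries = JSON.parse(body) as Array<{ user: { screen_name: string }; text: string }>;
  return entries.map(e => ({ user: e.user.screen_name, text: e.text }));
}

function parseXmlFeed(body: string): Status[] {
  // XML requires walking a document tree and pulling out text nodes.
  const doc = new DOMParser().parseFromString(body, "application/xml");
  return Array.from(doc.getElementsByTagName("status")).map(node => ({
    user: node.getElementsByTagName("screen_name")[0]?.textContent ?? "",
    text: node.getElementsByTagName("text")[0]?.textContent ?? "",
  }));
}
```

Either way the result is the same list of statuses; the difference the article measures is in parsing effort, speed, and memory use.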
http://www.ibm.com/developerworks/xml/library/x-andbene1/
"While most emerging technologies tend to happen quickly, one that has been 'emerging' for a really long time is the Semantic Web. However, Google's recent acquisition of Metaweb may be the signal that the Semantic Web has finally arrived... Most of the key standards such as RDF for tagging, OWL for setting ontologies and Sparql for handling queries, are now in place. In recent years, Metaweb's Freebase has emerged as an example of the power of semantic technologies. Freebase creates structured data out of concepts from sites around the web, such as Wikipedia, and makes it very simple to query and use that data.
In spirit, Freebase is definitely a Semantic Web project, since it is concerned with the semantics of content on the web. However, it isn't a pure semantic web project, since, while it does use standards such as RDF and OWL, it does not natively use Sparql as a query engine. In comparison, the similar DBPedia project does use Sparql and most semantic web standards. But outside of these issues, Metaweb's Freebase is definitely semantic, and the fact that Google has acquired the company could signal an increased focus on the Semantic Web within Google..."
According to the Google blog article: "Today, we've acquired Metaweb, a company that maintains an open database of things in the world.
Working together we want to improve search and make the web richer and more meaningful for everyone... With efforts like rich snippets and the search answers feature, we're just beginning to apply our understanding of the web to make search better... In addition to our ideas for search, we're also excited about the possibilities for Freebase, Metaweb's free and open database of over 12 million things, including movies, books, TV shows, celebrities, locations, companies and more.
Google and Metaweb plan to maintain Freebase as a free and open database for the world. Better yet, we plan to contribute to and further develop Freebase and would be delighted if other web companies use and contribute to the data. We believe that by improving Freebase, it will be a tremendous resource to make the web richer for everyone. And to the extent the web becomes a better place, this is good for webmasters and good for users..."
http://www.informationweek.com/blog/main/archives/2010/07/google_gets_sem.html
See also Jack Menzel's blog article: http://googleblog.blogspot.com/2010/07/deeper-understanding-with-metaweb.html
"Creating mashups in web applications can be a headache. Developers need to know intensive JavaScript, RSS, and Atom parsing, JSON parsing, and parsing other formats. You also need to study the low-level APIs provided by the mashup service providers and write a great deal of code to integrate the mashups with the web applications.
The Mashups4JSF components interact with the mashup services through the client-side APIs or the REST APIs offered by the mashup service providers. Mashups4JSF provides a set of factories that wraps the implemented services for each mashup service provider. For now, Mashups4JSF has factories for (Google, Yahoo!, YouTube, Twitter, and Digg). Using this architecture allows you to easily add services for the current supported mashup service providers and easily add more factories for new mashup service providers. Another advantage of this architecture is that the wrappered mashup services are totally decoupled from the Mashups4JSF components so the mashup services can be used independently.
This article illustrates the architecture of Mashups4JSF, the configuration of the library, and how to create a mashup application with few lines of code using Mashups4JSF and the IBM JSF Widget Library
(JWL) on the WebSphere Application Server V7.0 and JSF 2...
Mashups4JSF aims to offer declarative mashups to the development community, complementing the work done by GMaps4JSF. In future articles, I will explain the other features of Mashups4JSF, such as the Atom/RSS feed producer service, give more interactive examples of the other Mashups4JSF components, and illustrate how Mashups4JSF can work inside a portlets environment..."
http://www.ibm.com/developerworks/opensource/library/os-mashups4JSF/index.html
"Many digital libraries have not made the transition to semantic digital libraries, and often with good reason. Librarians and information technologists may not yet grasp the value of semantic mappings of bibliographic metadata, they may not have the resources to make the transition and, even if they do, semantic web tools and standards have varied in terms of maturity and performance. Selecting appropriate or reasonable classes and properties from ontologies, linking and augmenting bibliographic metadata as it is mapped to triples, data fusion and re-use, and considerations about what it means to represent this data as a graph, are all challenges librarians and information technologists face as they transition their various collections to the semantic web.
This paper presents some lessons we have learned building small, focused semantic digital library collections that combine bibliographic and non-bibliographic data, based on specific topics. The tools map and augment the metadata to produce a collection of triples. We have also developed some prototype tools atop these collections which allow users to explore the content in ways that were either not possible or not easy to do with other library systems...
The semantic web depends upon simple statements of fact. A statement contains a predicate, which links two nodes: a subject and an object.
The semantic web allows for the use of numerous sources for predicates.
This enables making fine-grained assertions about anything... In all, we use about 25 properties and classes from a dozen ontologies in our mapping of disparate data sources into instance data serialized as RDF/XML, and loaded into Sesame semantic repositories manually and via the openRDF API. We have developed mapping tools that map and augment content in MARC XML format, and OAI Dublin Core content from OAI repositories, RSS and Atom news feed content, and we also have custom tools for processing structured (XML) content, and data from relational databases. The XML and database mapping tools currently must be adapted for each new data source, so they represent one of the more brittle and labor-intensive aspects of the technologies.
Users can export files representing these graphs via web services.
Supported formats include Pajek's net format, GraphViz's DOT format, the Guess data format, and GraphML. This enables more sophisticated users to use other graph visualization tools for exploring the data, or to merge data from multiple sources. Restricting ourselves to a handful of ontologies by no means limits us in the future. The semantic web allows us to add new triples, using new predicates, whenever the need arises. There is no database schema to update, and the SPARQL query language is forgiving enough to allow changes to the underlying data, with minimal or no changes necessary for queries to continue to work against old and new data..."
http://www.dlib.org/dlib/july10/powell/07powell.html
See also the OpenRDF.org web site: http://www.openrdf.org/
"The HTML5 specification redefines b and i elements to have some semantic function, rather than purely presentational. However, the simple fact that the tag names are 'b' for bold and 'i' for italic means that people are likely to continue using them as a quick presentational fix. This article explains why that can be problematic for localization (and indeed for restyling of pages in a single language), and echoes the advice in the specification intended to address those issues...
A general issue: Using 'b' and 'i' tags (elements) can be problematic because it keeps authors thinking in presentational terms, rather than helping them move to properly semantic markup. At the very least, it blurs the ideas. To an author in a hurry, it is tempting to just use one of these tags in the text to make it look different, rather than to stop and think about things like portability and future-proofing.
Internationalization problems can arise because presentation may need to differ from one culture to another, particularly with respect to things like bold and italic styling... Just because an English document may use italicisation for emphasis, document titles and idiomatic phrases in a foreign language, it doesn't hold that a Japanese translation of the document will use a single presentational convention for all three types of content. Japanese authors may want to avoid both italicization and bolding, since their characters are too complicated to look good in small sizes with these effects.
[So] You should bear in mind that the content of a 'b' markup element may not always be bold, and that of an 'i' element may not always be italic. The actual style is dependent on the CSS style definitions.
You should also bear in mind that bold and italic may not be the preferred style for content in certain languages. You should not use 'b' and 'i' tags if there is a more descriptive and relevant tag available. If you do use them, it is usually better to add class attributes that describe the intended meaning of the markup, so that you can distinguish one use from another..."
From the HTML5 draft specification: "The 'i' element represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose, such as a taxonomic designation, a technical term, an idiomatic phrase from another language, a thought, a ship name, or some other prose whose typical typographic presentation is italicized...
The 'b' element represents a span of text to be stylistically offset from the normal prose without conveying any extra importance, such as key words in a document abstract, product names in a review, or other spans of text whose typical typographic presentation is boldened... As with the 'i' element, authors are encouraged to use the class attribute on the 'b' element to identify why the element is being used, so that if the style of a particular use is to be changed at a later date, the author doesn't have to go through annotating each use. The 'b' element should be used as a last resort when no other element is more appropriate.
In particular, headings should use the 'h1' to 'h6' elements, stress emphasis should use the em element, importance should be denoted with the strong element, and text marked or highlighted should use the mark element..."
http://www.w3.org/International/questions/qa-b-and-i-tags
See also the draft HTML5 specification: http://dev.w3.org/html5/spec/Overview.html#the-i-element
"The U.S. White House has published a draft of a strategy designed to make the concept of trusted identities and authentication more of a reality in the digital world. In a 39-page document ('National Strategy for Trusted Identities in Cyberspace: Creating Options for Enhanced Online Security and Privacy'), the White House promotes what it calls the Identity Ecosystem, an interoperable environment where individuals, organizations and devices can trust each other because authoritative sources establish and authenticate their digital identities.
The proposed ecosystem will consist of three main layers: a governance layer that establishes the rules of the environment; a management layer that applies and enforces the rules of the ecosystem; and the execution layer that conducts transactions in accordance with the rules.
The U.S. Department of Homeland Security will be collecting comments from the public on the document until July 19, 2010..."
From the document's Executive Summary: "Privacy protection and voluntary participation are pillars of the Identity Ecosystem. The Identity Ecosystem protects anonymous parties by keeping their identity a secret and sharing only the information necessary to complete the transaction.
For example, the Identity Ecosystem allows an individual to provide age without releasing birth date, name, address, or other identifying data.
At the other end of the spectrum, the Identity Ecosystem supports transactions that require high assurance of a participant's identity.
The Identity Ecosystem reduces the risk of exploitation of information by unauthorized access through more robust access control techniques.
Finally, participation in the Identity Ecosystem is voluntary for both organizations and individuals...
Interoperability and privacy protection combine to create a user-centric Identity Ecosystem. User-centricity will allow individuals to select the interoperable credential appropriate for the transaction. Through the creation and adoption of privacy-enhancing policies and standards, individuals will have the ability to transmit no more than the amount of information necessary for the transaction, unless they choose otherwise. In addition, such standards will inhibit the linking of an individual's transactions and credential use by service providers.
Individuals will have more confidence that they exchange information with the appropriate parties, securely transmit that information, and have the information protected in accordance with privacy best practices..."
http://www.eweek.com/c/a/Security/US-Outlines-Security-Strategy-for-Online-Identity-125949/
See also the online document: http://www.dhs.gov/xlibrary/assets/ns_tic.pdf
"Android applications often must access data that resides on the Internet, and Internet data can be structured in several different formats. This article illustrates how to build an Android application that works with two popular data formats -- XML and JavaScript Object Notation (JSON) -- as well as the more exotic protocol buffers format from Google...
First we develop a Web service that converts CSV data into XML, JSON, and protocol-buffers formats. Then we build a sample Android application that can pull the data from the Web service in any of these formats and parse it for display to the user.
XML is a first-class citizen on Android, which is a good thing given how many Web services rely on XML. Many services also support JSON, another popular format. It is usually a little more compact than XML, but it is still human-readable, making it easy to work with and easy to debug applications that use it. Android includes a JSON parser...
Protocol buffers is a language-agnostic data-serialization format developed by Google, designed to be faster than XML for sending data over a network. It is the de facto standard at Google for any server-to-server calls. Google made the format and its binding tools for the C++, Java, and Python programming languages available as open source..."
http://www.ibm.com/developerworks/xml/library/x-dataAndroid/
See also the Protocol Buffers overview: http://code.google.com/apis/protocolbuffers/docs/overview.html
"HTML 5 comes with plenty of new features for mobile Web applications, including visual ones that usually make the most impact. Canvas is the most eye-catching of the new UI capabilities, providing full 2-D graphics in the browser. In this article you learn to use Canvas as well as some of the other new visual elements in HTML 5 that are more subtle but make a big difference for mobile users...
The article is a whirlwind tour of many of the new UI-related features in HTML 5, from new elements to new style to the drawing canvas. These features, with a few notable exceptions covered at the end, are all available for you to use in the WebKit-based browsers found on the iPhone and on Android-based devices. Other popular platforms like the BlackBerry and Nokia smartphones are getting more powerful browsers that also leverage the same technologies you have looked at in this article.
As a mobile Web developer you have the opportunity to target a wide range of users with visual features more powerful than anything you have ever had access to with HTML, CSS, and JavaScript on desktop browsers.
The previous four parts of this series talked about many other new technologies (like geolocation and Web Workers) that are available to you on these amazing new mobile browsers. The mobile Web is not some weaker version of the Web you have programmed for years; it is a more powerful version full of possibilities..."
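To give a flavor of the canvas capability described above, here is a minimal, generic TypeScript sketch (not taken from the article): it grabs a canvas element's 2-D context and draws a couple of shapes, the kind of code that runs in the WebKit browsers on the iPhone and Android devices the article targets. The element id is invented for the example.

```typescript
// Minimal canvas sketch (generic, not the article's example): draw a
// filled rectangle and a circle on a <canvas id="chart"> element.
const canvas = document.getElementById("chart") as HTMLCanvasElement | null;
const ctx = canvas?.getContext("2d");

if (ctx) {
  ctx.fillStyle = "#336699";
  ctx.fillRect(10, 10, 120, 60);            // x, y, width, height

  ctx.beginPath();
  ctx.arc(200, 40, 30, 0, Math.PI * 2);     // center x/y, radius, start/end angles
  ctx.fillStyle = "#cc3333";
  ctx.fill();
}
```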
http://www.ibm.com/developerworks/library/x-html5mobile5/index.html
See also the HTML5 specification: http://dev.w3.org/html5/spec/Overview.html
Real-time web applications allow users to receive notifications as soon as information is published, without needing to check the original source manually for updates. They have been popularized by social-notification tools like Twitter and Friendfeed, web-based collaboration tools like Google Wave, and web-based chat clients like Meebo.
The Extensible Messaging and Presence Protocol (XMPP) is an XML-based set of technologies for real-time applications, defined as networked applications that continually update in response to new or changed data. It was originally developed as a framework to support instant messaging and presence applications within enterprise environments...
This tutorial introduces you to the real-time web and takes you through some of the reasons for building real-time web applications.
You learn techniques that allow you to create responsive, continually updated web applications that conserve server resources while providing a slick user experience.
http://www.ibm.com/developerworks/xml/tutorials/x-realtimeXMPPtut/index.html
"The W3C cheatsheet for Web developers is a compact Web application that provides quick access to useful information from various W3C specs.
Making that Web app mobile friendly has always been one of its design
goals: it uses a very compact layout; the JavaScript-based auto-complete search was tweaked to work reasonably well with mobile keyboards (including virtual keyboards); and it uses HTML5's ApplicationCache so that it remains usable off-line in browsers that support it.
One of the W3C Working Groups, the Web Applications Working Group is developing a stack of specifications to make it easier to develop applications with widgets. There are quite a few similar efforts in various communities: Nokia's Web runtime engine, Firefox add-ons, Chrome extensions, and Safari extensions to name a few. It will be interesting to see if all these efforts end up converging toward the current (or a future revision of) the W3C widgets specifications.
The W3C Cheat Sheet on Android is obviously not particularly an endorsement of Android, even less so an endorsement of the world of application markets; a growing number of people seem to see these markets as in opposition to the Web -- my personal opinion is that they're probably complementary, the same way a Web portal or a social bookmarking service is complementary to search engines.
This Cheat Sheet on Android allows quick access to: (1) the description of the various language tokens (elements, attributes, properties, functions, etc.) of HTML, CSS, SVG and XPath, through the text entry box on the Search tab; when you start typing a string, a drop-down menu appears, allowing you to select a token among those that match what you have typed; (2) the summary of the Mobile Web Best Practices, under the mobile tab; (3) the Web Content Accessibility Guidelines 2.0 at a glance, under the accessibility tab; (4) the internationalization quicktips under the I18N tab; (5) and some typography reminders in the typography tab..."
http://www.w3.org/QA/2010/06/w3c_cheatsheet_on_android_mark.html
See also the compiled Android application: http://dev.w3.org/2009/cheatsheet/doc/android
The next version of Opera's browser adds support for more HTML5 features, and is now available in beta... The hype surrounding HTML5 is growing, but the standard also holds the promise to change the way the Web is used. It is a huge step on the way to turning the browser and the Web into a proper platform for running applications, according to Jan Standal, vice president of desktop products at Opera.
Implementing HTML5 is, just like the standard itself, a work in progress.
In version 10.6 Opera has expanded the browser's video capabilities by adding the new, open WebM file format, which Google announced last month. Mozilla, Opera, Adobe, and more than 40 other vendors back the standard, according to the project's Web site. The format is looking very promising, said Standal. Opera has also added AppCache, which is one of the components that will make it possible to run Web applications without being online...
In addition to these HTML5 improvements, Opera has also implemented the Geolocation API, which is being developed by W3C, and Web Workers, developed by the Web Hypertext Application Technology Working Group (WHATWG)..."
The W3C Geolocation API "defines a high-level interface to location information associated only with the device hosting the implementation, such as latitude and longitude. The API itself is agnostic of the underlying location information sources. Common sources of location information include Global Positioning System (GPS) and location inferred from network signals such as IP address, RFID, WiFi and Bluetooth MAC addresses, and GSM/CDMA cell IDs, as well as user input.
No guarantee is given that the API returns the device's actual location. The API is designed to enable both "one-shot" position requests and repeated position updates, as well as the ability to explicitly query the cached positions. Location information is represented by latitude and longitude coordinates. The Geolocation API in this specification builds upon earlier work in the industry..."
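The Geolocation API described here is small; a minimal, generic usage sketch in TypeScript (not from the article or the spec text) looks like this. The logging and option values are illustrative only.

```typescript
// Minimal one-shot position request against the W3C Geolocation API.
// The success callback receives latitude/longitude plus an accuracy
// estimate; the error callback fires if the user denies access or no
// position source is available.
if ("geolocation" in navigator) {
  navigator.geolocation.getCurrentPosition(
    (position: GeolocationPosition) => {
      const { latitude, longitude, accuracy } = position.coords;
      console.log(`lat ${latitude}, lon ${longitude} (±${accuracy} m)`);
    },
    (error: GeolocationPositionError) => console.warn("position unavailable:", error.message),
    { enableHighAccuracy: false, timeout: 10000, maximumAge: 60000 }
  );
}
```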
http://www.infoworld.com/d/applications/opera-gains-more-html5-features-422
See also the W3C Geolocation API Specification: http://dev.w3.org/geo/api/spec-source.html
"On the web, feeds are machine-readable summaries of content, usually arranged in reverse chronological order. Most feeds have traditionally been used to syndicate blog content in the popular RSS or Atom XML-based formats. Once published by a site, the content can be read through user-friendly aggregators, or transformed and interpreted by networked software products... Syndicated feeds have been consumed in this way since 1999. However, in recent years, web users have also been consuming content in a more social way, through sites like Facebook, MySpace, and Twitter...
An activity stream (also sometimes called a lifestream) is the collection of all the activities a person undertakes on a particular site. As web users rely more and more on activity streams for information consumption, it makes sense to be able to syndicate and subscribe to activity stream data. But since RSS and Atom don't support social metadata, a new format is needed to syndicate social activity...
Activity Streams emerged from the DiSo Project, an open source effort to build a decentralized social web using plug-ins developed for the WordPress blogging platform as a starting point. In the DiSo model, each user's profile is a separate WordPress blog that can be hosted on any Internet-connected infrastructure. Social actions then occur across the Internet among these WordPress sites.
XML is the perfect technology for implementing this approach, because it is cross-platform, easy to publish and parse, and doesn't require any specialist technology. The DiSo Project went one step further and conceived of the Activity Streams standard as an extension of the Atom feed format... In March 2009, MySpace became the first major social media provider to publish feeds in the Activity Streams format. Since then, many more have followed, including Facebook, Hulu, TypePad, and Opera. But the scope for Activity Streams isn't limited to sites like Facebook. Intranets, for example, can be greatly enhanced by knowledge of social activity within a company, or between companies..."
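Since Activity Streams extends Atom, consuming it amounts to ordinary namespace-aware feed parsing. The TypeScript sketch below is hypothetical; the activity namespace URI shown is the one commonly associated with the Activity Streams 1.0 Atom extension, but treat it as an assumption rather than something stated in this article.

```typescript
// Hypothetical sketch: pull the actor, verb, and title out of Atom
// entries that carry Activity Streams extension elements.
// Assumption: the extension namespace is http://activitystrea.ms/spec/1.0/.
const ATOM_NS = "http://www.w3.org/2005/Atom";
const ACTIVITY_NS = "http://activitystrea.ms/spec/1.0/";   // assumed namespace URI

function describeActivities(atomXml: string): string[] {
  const doc = new DOMParser().parseFromString(atomXml, "application/xml");
  return Array.from(doc.getElementsByTagNameNS(ATOM_NS, "entry")).map(entry => {
    const actor = entry.getElementsByTagNameNS(ATOM_NS, "name")[0]?.textContent ?? "someone";
    const verb = entry.getElementsByTagNameNS(ACTIVITY_NS, "verb")[0]?.textContent ?? "did";
    const title = entry.getElementsByTagNameNS(ATOM_NS, "title")[0]?.textContent ?? "";
    return `${actor} ${verb} ${title}`;
  });
}
```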
http://www.ibm.com/developerworks/xml/library/x-activitystreams/
See also the Activity Streams web site: http://activitystrea.ms/
"In response to a user recently, I told him he had fallen into the most common elephant trap for XSLT users. Rather than being annoyed, which I half expected, he thanked me and asked me if I could tell him what the next most common elephant traps were. Although some of us have been helping users avoid these traps for many years, I don't recall seeing a list of them, so I thought I would spend half an hour compiling my own list...
(1) Matching elements in the default namespace... If the source document contains a default namespace declaration 'xmlns="something"', then every time you refer to an element name in an XPath expression or match pattern, you have to make it clear you are talking about names in that namespace.
(2) Using relative paths: 'xsl:apply-templates' and 'xsl:for-each' set the context node; within the 'loop', paths should be written to start from this context node...
(3) Variables hold values, not fragments of expression syntax... Some people imagine that a variable reference '$x'
is like a macro, expanded into the syntax of an XPath expression by textual substitution -- rather like variables in shell script languages.
It isn't: you can only use a variable where you could use a value...
(4) Template rules and 'xsl:apply-templates' are not an advanced feature to be used only by advanced users. They are the most basic fundamental construct in the XSLT language. Don't keep putting off the day when you start to use them. If you aren't using them, you are making your life unnecessarily difficult...
(5) XSLT takes a tree as input, and produces a tree as output. Failure to understand this accounts for many of the frustrations beginners have with XSLT. XSLT can't process things that aren't represented in the tree produced by the XML parser (CDATA sections, entity references, the XML declaration) and it can't generate these things in the output either...
(6) Namespaces are difficult. There are no easy answers to getting them right: this probably needs another article of its own; the key is to understand the data model for namespaces...
(7) Don't use disable-output-escaping: Some people use it as magic fairy dust; they don't know what it does, but they hope it might make things work better... This attribute is for experts only, and experts will only use it as an absolute last resort...
(8) The 'xsl:copy-of' instruction creates an exact copy of a source tree, namespaces and all. If you want to copy a tree with changes, then you can't use 'xsl:copy-of'. Instead, use the identity-template coding pattern...
(9) Don't use [xsl:variable name="x"][xsl:value-of select="y"/] [/xsl:variable]. Instead use [xsl:variable name="x" select="y"/]... The latter is shorter to write, and much more efficient to execute, and in many cases it's correct where the former is incorrect. (10) When you need to search for data, use keys. As with template rules, don't put off learning how to use keys or dismiss them as an advanced feature...."
http://saxonica.blogharbor.com/blog/_archives/2010/6/11/4550606.html
See also the Saxonica home page: http://www.saxonica.com/
Technical comment is invited for the W3C First Public Working Draft of "Requirements and Use Cases for XSLT 2.1." This specification was produced by members of the W3C XSL Working Group, which is part of the XML Activity. The Working Group expects to eventually publish this document as a Working Group Note. The Working Group requests that errors in the document be reported using W3C's public Bugzilla system; please use multiple Bugzilla entries (or, if necessary, multiple email messages) if you have more than one comment to make... XSLT Version 2.1 is a language for transforming XML documents into other XML documents, and constitutes a revised version of the XSLT 2.0 Recommendation published on 23-January-2007. The primary purpose of the changes in this version of the language is to enable transformations to be performed in streaming mode, where neither the source document nor the result document is ever held in memory in its entirety. XSLT 2.1 is designed to be used in conjunction with XPath 2.1. XSLT shares the same data model as XPath 2.1, which is defined in the Data Model, and it uses the library of functions and operators. XPath 2.1 and the underlying function library introduce a number of enhancements, for example the availability of higher-order functions. Some of the functions that were previously defined in the XSLT 2.0 specification, such as the format-date and format-number functions, are now defined in the standard function library to make them available to other host languages. XSLT 2.1 also includes optional facilities to serialize the results of a transformation, by means of an interface to the serialization component...
The WD document provides a representation of requirements and use cases for "XSL Transformations (XSLT) Version 2.1", published as a W3C Working Draft on 11-May-2010. The Requirements lists enhancements requested over time that may be addressed in XSLT 2.1. The document is organized in three major sections: Requirements, Real-World Scenarios, and Tasks.
Sample Data are provided in document Appendices. There are sixteen (16) Requirements, including Enabling Streamable Processing; Modes and Schema-awareness; Composite Keys; The 'xsl:analyze-string' Instruction Applied to an Empty Sequence; Context Item for a Named Template; Traditional Hebrew Numbering; Separate Compilation of Stylesheet Modules; The 'start-at' Attribute of 'xsl:number'; Allowing 'xsl:variable' before 'xsl:param'; Combining 'group-starting-with' and 'group-ending-with'; Improvements to Schema for Stylesheets; Setting Initial Template Parameters; Invoking XQuery from XSLT; Enhancement to Sorting and Grouping; Enhancement to Conditional Modes; Default Initial Template.
Real World Scenarios illustrate when real users reach limits of existing XML transformation standards. These use cases are elaborated in form of short stories, and include: Transforming MPEG-21 BSDL; Validation of SOAP Digital Signatures; Transformation of the RDF Dump of the Open Directory; Transformations on a Cell Phone; XSL FO Multiple Extraction/Processing; EFT/EDI Transformation.
Tasks are examples of relatively simple transformations whose definitions in XSLT 2.0 are not easy, straightforward or even possible. Some of these tasks are difficult solely because of the fact that one or more input or output XML documents is so large that the entire document cannot be held in memory. Other difficulties are related to merging and forking documents, restricted capabilities to iterate and the lack of common constructs (dynamic evaluation of expressions, try/catch). The transformation task illustrating troubles with huge XML documents can be defined in XSLT 2.0; the processor can even recognize that there is no need to keep the entire document in memory and can run the transformation in a memory-efficient way in some cases. But there no guarantee of this behavior. New facilities suggested for XSLT 2.1 aim to guarantee that a transformation must be processed in a streaming manner. Enumerated tasks include Splitting Flat Data; Splitting Nested Data; Joining; Concatenation; Adding Children; Renaming and Counting Nested Elements; Renaming and Counting Nested Elements and Counting Other Elements; Filtering According to Attribute; Filtering According to Child; Histogram; Hierarchical to Flat; Flat to Hierarchical; CSV Result; Local Sorting; Resolving References; Multiple Extraction/Processing; Grouping; Iterations; Making Explicit Sections; Merging Sorted Sequences.
http://www.w3.org/TR/2010/WD-xslt-21-requirements-20100610/
See also XSL Transformations (XSLT) Version 2.1: http://www.w3.org/TR/2010/WD-xslt-21-20100511/
"With Google and Apple strongly supporting HTML5 as the solution for rich applications for the Internet, it's become the buzzword of the month -- particularly after Google I/O. Given its hot currency, though, it's not surprising that the term is starting to become unhinged from reality. Already, we're starting to see job postings requiring 'HTML5 experience,' and people pointing to everything from simple JavaScript animations to CSS3 effects as examples of HTML5. Just as 'AJAX' and 'Web 2.0' became handy (and widely misused) shorthand for 'next- generation' web development in the mid-2000's, HTML5 is now becoming the next overloaded term...
When many folks say 'HTML5' they mean the broad collection technologies that are now being implemented in the Webkit-based browsers (Safari and Chrome), Opera and Firefox... The core W3C HTML5 spec is just one part of the collection of related technologies: The HTML5 specification proper; Cascading Style Sheets Version 3 (CSS3); Web Workers; Web Storage; Web SQL Database; Web Sockets; Geolocation; Microdata; Device API and File API...
The Web Sockets protocol is in the first stage of the standards process and has also been submitted as an IETF draft because it is a networking protocol. It defines a non-http-based asynchronous client/server protocol that can be used in place of the current AJAX methods for asynchronous server communication. It uses an initial http: request to bootstrap the new protocol.
Geolocation is a simple spec that provides a built-in geolocation object that scripts can query. It also provides methods for defining location cache freshness requirements. This is fairly non-controversial and already in new browsers. File API allows single and multiple file uploads from the user desktop. It's unclear exactly who will support this, but there doesn't seem to be much confusion about what it's supposed to do..."
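To make the Web Sockets description concrete, here is a minimal, generic client sketch in TypeScript; the echo URL is a placeholder invented for the example, not something referenced by the article.

```typescript
// Minimal Web Sockets client: open a socket, send a message when the
// connection is established, and log whatever the server pushes back.
// ws://example.org/echo is a placeholder endpoint for illustration.
const socket = new WebSocket("ws://example.org/echo");

socket.onopen = () => socket.send("hello from the browser");
socket.onmessage = (event: MessageEvent) => console.log("server said:", event.data);
socket.onerror = () => console.warn("socket error");
socket.onclose = (event: CloseEvent) => console.log("closed with code", event.code);
```

Unlike the AJAX polling it replaces, the server can push data down this connection at any time after the initial HTTP handshake.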
http://www.drdobbs.com/article/printableArticle.jhtml?articleId=225300318
Members of the W3C RDB2RDF Working Group invite technical comment on the First Public Working Draft of "Use Cases and Requirements for Mapping Relational Databases to RDF." The need to share data with collaborators motivates custodians and users of relational databases (RDB) to expose relational data on the Web of Data. This document examines a set of use cases from science and industry, taking relational data and exposing it in patterns conforming to shared RDF schemata. These use cases expose a set of functional requirements for exposing relational data as RDF in the RDB2RDF Mapping Language (R2RML)...
The majority of dynamic Web content is backed by relational databases (RDB), and so are many enterprise systems. On the other hand, in order to expose structured data on the Web, Resource Description Framework (RDF) is used. This document reviews use cases and requirements for a relational database to RDF mapping (RDB2RDF) with the following structure: (1) The remainder of this section motivates why mapping RDBs to RDF is necessary and highlights the importance of a standard. (2) In the next section RDB2RDF use cases are reviewed. (3) The last section discusses requirements regarding a RDB2RDF mapping language, driven by an analysis of the aforementioned use cases...
Use of a standard mapping language for RDB to RDF may allow use of a single mapping specification in the context of mirroring of schema and (possibly some or all of the) data in various databases, possibly from different vendors (e.g., Oracle database, MySQL, etc.) and located at various sites. Similarly structured data (that is, data stored using the same schema) is useful in many different organizations often located in different parts of the world. These organizations may employ databases from different vendors due to one or more of many possible factors (such as licensing cost, resource constraints, availability of useful tools and applications and of appropriate database administrators, etc.). Presence of a standard RDB2RDF mapping language allows creation and use of a single mapping specification against each of the hosting databases to present a single (virtual or materialized) RDF view of the relational data hosted in those databases, and this RDF view can then be queried by applications using the SPARQL query language or protocol...
Another reason for a standard is to allow easy migration between different systems. Just as a single web page in HTML can be viewed by two different Web browsers from different vendors, a single RDB2RDF mapping standard should allow a user of one database to expose their data as RDF, and then, when they export their data to another database, allow the newly imported data to be queried as RDF without changing the mapping file.
The mission of the RDB2RDF Working Group, part of the Semantic Web Activity, is "to standardize a language for mapping relational data and relational database schemas into RDF and OWL, tentatively called the RDB2RDF Mapping Language, R2RML." The mapping language defined by the WG will facilitate the development of several types of products. It could be used to translate relational data into RDF which could be stored in a triple store. This is sometimes called Extract-Transform-Load (ETL). Or it could be used to generate a virtual mapping that could be queried using SPARQL, with the SPARQL translated to SQL queries on the underlying relational data.
Other products could be layered on top of these capabilities to query and deliver data in different ways as well as to integrate the data with other kinds of information on the Semantic Web..."
http://www.w3.org/TR/2010/WD-rdb2rdf-ucr-20100608/
See also the W3C RDB2RDF Working Group: http://www.w3.org/2001/sw/rdb2rdf/
W3C has announced publication of a First Public Working Draft for a specification "RDFa API: An API for Extracting Structured Data from Web Documents." The document was produced by the RDFa Working Group, which was chartered to support the developing use of RDFa for embedding structured data in Web documents in general.
"This document details such a mechanism; an RDFa Document Object Model Application Programming Interface (RDFa DOM API) that allows simple extraction and usage of structured information from a Web document.
RDFa API provides a mechanism that allows Web-based applications using documents containing RDFa markup to extract and utilize structured data in a way that is useful to developers. The specification details how a developer may extract, store and query structured data contained within one or more RDFa-enabled documents. The design of the system is modular and allows multiple pluggable extraction and storage mechanisms supporting not only RDFa, but also Microformats, Microdata, and other structured data formats. For more information about the Semantic Web, please see the Semantic Web Activity.
RDFa provides a means to attach properties to elements in XML and HTML documents. Since the purpose of these additional properties is to provide information about real-world items, such as people, films, companies, events, and so on, properties are grouped into objects called Property Groups. The RDFa DOM API provides a set of interfaces that make it easy to manipulate DOM objects that contain information that is also part of a Property Group. This specification defines these interfaces. A document that contains RDFa effectively provides two data layers. The first layer is the information about the document itself, such as the relationship between the elements, the value of its attributes, the origin of the document, and so on, and this information is usually provided by the Document Object Model, or DOM.
The second data layer comprises information provided by embedded metadata, such as company names, film titles, ratings, and so on, and this is usually provided by RDFa, Microformats, DC-HTML, GRDDL, or Microdata.
Whilst this embedded information could be accessed via the usual DOM interfaces -- for example, by iterating through child elements and checking attribute values -- the potentially complex interrelationships between the data mean that it is more efficient for developers if they have access to the data after it has been interpreted. For example, a document may contain the name of a person in one section and the phone number of the same person in another; whilst the basic DOM interfaces provide access to these two pieces of information through normal navigation, it is more convenient for authors to have these two pieces of information available in one property collection, reflecting the final Property Group..."
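The Working Draft defines its own interfaces for this kind of access; purely to illustrate the two-layer idea (DOM structure versus embedded metadata), here is a plain-DOM TypeScript sketch that groups RDFa property/value pairs by subject. It deliberately uses only standard DOM calls, not the RDFa API interfaces the draft specifies.

```typescript
// Illustration only: gather RDFa name/value pairs from a document using
// plain DOM calls. The Working Draft defines richer, dedicated interfaces
// for this; this sketch just shows the "second data layer" idea.
function collectRdfaProperties(root: Document): Map<string, string[]> {
  const bySubject = new Map<string, string[]>();
  root.querySelectorAll("[property]").forEach(el => {
    // Use the nearest ancestor's "about" attribute as the subject, if any.
    const subject = el.closest("[about]")?.getAttribute("about") ?? "(document)";
    const entry = `${el.getAttribute("property")} = ${el.getAttribute("content") ?? el.textContent?.trim()}`;
    bySubject.set(subject, [...(bySubject.get(subject) ?? []), entry]);
  });
  return bySubject;
}
```

The value of a dedicated API is exactly what the draft argues: it hands developers the interpreted Property Groups directly, instead of leaving them to reassemble related statements by walking the tree as above.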
http://www.w3.org/TR/2010/WD-rdfa-api-20100608/
See also the RDFa Primer: http://www.w3.org/TR/xhtml-rdfa-primer/
"After more than 10 years in development, the STIX Fonts are now available for free download from the Scientific and Technical Information Exchange Project web site. The mission of the (STIX) font creation project is the preparation of a comprehensive set of fonts that serve the scientific and engineering community in the process from manuscript creation through final publication, both in electronic and print formats. Toward this purpose, the STIX fonts will be made available, under royalty-free license, to anyone, including publishers, software developers, scientists, students, and the general public.
The Unicode-based OpenType fonts have been designed to support the full range of characters and symbols needed in STM publishing, for both print and online formats. The fonts include more than 8,000 glyphs in multiple weights, sizes, and slants and support the complete range of Latin alphabets, as well as Greek and Cyrillic. The largest component of the fonts is devoted to the thousands of mathematical operators and technical symbols necessary to report research.
The initial release (version 1.0) provides the STIX Fonts as a set of
23 OpenType fonts, a format suitable for use by most dedicated STM typesetting programs, equation editors, and other applications. A second release (version 1.1) containing advanced OpenType support required by applications like Microsoft Office will follow by the end of 2010. The third release (version 1.2) will be a set of Type 1 fonts suitable for use with LaTeX, a standard tool in the mathematics and science communities, and is expected to be completed in 2011. The STIX Fonts are released under the SIL Open Font License (OFL), a license designed specifically for collaborative font projects and to provide a free and open framework in which fonts may be shared and improved in partnership with others.
Speaking on behalf of the STI Pub coalition of publishers who developed the fonts, Tim Ingoldsby of the American Institute of Physics said:
'This project, which received funding support as well as the contributions of many staff members from the American Chemical Society, American Institute of Physics, American Mathematical Society, American Physical Society, Institute of Electrical and Electronics Engineers, and Elsevier, represents a significant step forward in the STM publishing process. Now individual researchers can come to a single source to obtain a free set of fonts that they can be assured contains substantially every character or symbol needed for reporting their results'..."
http://www.aip.org/press_release/stixfonts_v1.0_released.html
See also the STI Pub Coalition web site: http://www.stixfonts.org/stipubs.html
"Google has unveiled an open-source, royalty-free video format called WebM, lining up commitments from Mozilla and Opera to support the encoding technology in their browsers and pledging to support it on its YouTube site... It's not yet clear how much success Google will have spreading WebM, but the company has big Web ambitions, a powerful brand, heavy influence through the popularity of YouTube, and deep pockets to help handle any legal threats to the WebM project...
The format is based on the VP8 technology that Google acquired from On2 Technologies in February. It also uses the Ogg Vorbis audio technology from the Xiph.Org Foundation. The 'codec' technology for encoding and decoding video competes with H.264, a format that Apple and Microsoft prefer but that comes with steep licensing fees and restrictions that keep it out of open-source software. That includes Mozilla's Firefox and Google's Chromium, the open-source project underlying its Chrome browser..."
From the WebM Project web site: "What is WebM? WebM is an open, royalty-free media file format designed for the web. WebM files consist of video streams compressed with the VP8 video codec and audio streams compressed with the Vorbis audio codec. The WebM file structure is based on the Matroska media container. VP8 is a highly efficient video compression technology that was developed by On2 Technologies.... A key factor in the web's success is that its core technologies such as HTML, HTTP, and TCP/IP are open and freely implementable. Though video is also now core to the web experience, there is unfortunately no open and free video format that is on par with the leading commercial choices. To that end, we started the WebM project, a broadly-backed community effort to develop an open web media format.
WebM was built for the web. By testing hundreds of thousands of videos with widely varying characteristics, we found that the VP8 video codec delivers high-quality video while efficiently adapting to varying processing and bandwidth conditions across a broad range of devices.
VP8's highly efficient bandwidth usage and lower storage requirements can help publishers recognize immediate cost savings. Also, the relative simplicity of VP8 makes it easy to integrate into existing environments and requires comparatively little manual tuning in the encoder to produce high-quality results..."
http://news.cnet.com/8301-30685_3-20005378-264.html
See also the WebM Project web site: http://www.webmproject.org/about/
"HTML 5 is a very hyped technology, but with good reason. It promises to be a technological tipping point for bringing desktop application capabilities to the browser. As promising as it is for traditional browsers, it has even more potential for mobile browsers. Even better, the most popular mobile browsers have already adopted and implemented many significant parts of the HTML 5 specification.
In this five-part series, you will take a closer look at several of those new technologies that are part of HTML 5, that can have a huge impact on mobile Web application development. In each part of this series you will develop a working mobile Web application showcasing an HTML 5 feature that can be used on modern mobile Web browsers, like the ones found on the iPhone and Android-based devices...
Geolocation by itself is somewhat of a novelty. It allows you to determine where the user is. However, just knowing this and reporting it to the user would not be very useful. Why would anyone care about their exact latitude and longitude? It is when you start using this in combination with other data and services that can make use of location that you start to produce some interesting results. Almost all of these services will want a user's latitude and longitude as part of their input. Often this is all you need...
[The article thus shows you] how to use geolocation APIs in a mobile Web application. GPS can sound very sexy, but complicated. However, as you see here, the W3C standard for geolocation provides a very simple API. It is straightforward to get a user's location and track that location over time. From there you can pass the coordinates to a variety of Web services that support location, or perhaps you have your own location-aware service that you are developing. In Part 2 of this series on HTML 5 and mobile Web applications, we will look at how to take advantage of local storage to improve the performance of mobile Web applications..."
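As a reminder of how little code continuous tracking takes, here is a generic TypeScript sketch using watchPosition (not the article's listing); the logging is illustrative only.

```typescript
// Generic sketch of continuous tracking with the Geolocation API
// (not the article's listing): watchPosition keeps firing the callback
// as the device moves; clearWatch stops the updates.
const watchId = navigator.geolocation.watchPosition(
  pos => console.log("now at", pos.coords.latitude, pos.coords.longitude),
  err => console.warn("geolocation error:", err.message),
  { enableHighAccuracy: true }
);

// Later, e.g. when the user leaves the page or taps "stop":
function stopTracking(): void {
  navigator.geolocation.clearWatch(watchId);
}
```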
http://www.ibm.com/developerworks/library/x-html5mobile1/index.html
See also HTML5, vocabulary and associated APIs for HTML and XHTML: http://dev.w3.org/html5/spec/Overview.html
Google has released a programming tool to help move its Native Client project -- and more broadly, its cloud-computing ambitions -- from abstract idea to practical reality. The new Native Client software developer kit, though only a developer preview version, is designed to make it easier for programmers to use the Net giant's browser-boosting Native Client technology...
To let people download Native Client modules from Web pages without security problems, NaCl prohibits various operations and confines NaCl program modules to a sandbox with restricted privileges. NaCl lets programmers write in a variety of languages, and a special compiler converts their work into the NaCl modules. The ultimate promise of NaCl is that Web-based applications could run much faster than those of today that typically use JavaScript or Adobe Systems' Flash. If Google can attract developers, the Web and cloud computing could become a much more powerful foundation for programs. The NPAPI Pepper project and NaCl SDK show one of Google's biggest challenges in bringing its cloud-computing vision to reality, though: getting others to come along for the ride. To make NaCl real, it must convince programmers to use the software, convince browser makers to include it or at least support it as a plug-in, and convince the general public to upgrade their browsers to use it..."
From the Blog article: "Today, we're happy to make available a developer preview of the Native Client SDK -- an important first step in making Native Client more accessible as a tool for developing real web applications. When we released the research version of Native Client a year ago, we offered a snapshot of our source tree that developers could download and tinker with, but the download was big and cumbersome to use. The Native Client SDK preview, in contrast, includes just the basics you need to get started writing an app in minutes: a GCC-based compiler for creating x86-32 or x86-64 binaries from C or C++ source code, ports of popular open source projects like zlib, Lua, and libjpeg, and a few samples that will help you get started developing with the NPAPI Pepper Extensions. Taken together, the SDK lets you write C/C++ code that works seamlessly in Chromium and gives you access to powerful APIs to build your web app...
To get started with the SDK preview, grab a copy of the download at code.google.com/p/nativeclient-sdk. You'll also need a recent build of Chromium started with the '--enable-nacl' command-line flag to test the samples and your apps. Because the SDK relies on NPAPI Pepper extensions that are currently only available in Chromium, the SDK won't work with the Native Client browser plug-ins..."
http://news.cnet.com/8301-30685_3-20004874-264.html
See also the Blog article: http://blog.chromium.org/2010/05/sneak-peek-at-native-client-sdk.html
Ext JS is an advanced JavaScript framework that not only supports and simplifies the foundations of Asynchronous JavaScript and XML (Ajax) development, but also maintains a large toolkit of reusable UI components.
In this article, the author presents the new features and updates to the Ext JS framework, which currently stands at version 3.1.
Traditionally, developing with Ext JS has been about taking advantage of its bread and butter: the UI component framework. It is clearly one of the most advanced around and is miles ahead of its competitors.
However, what if I just need to sprinkle in some Ajax or query and style a portion of the DOM? This is where Ext Core comes in. Meant to serve as a competitor to other popular frameworks such as Prototype/script.aculo.us and jQuery, Ext Core is a lightweight distribution containing all the, well, core aspects you'd expect from a modern JavaScript framework. From element augmentation and DOM querying to Ajax and utility classes, Ext Core has everything you need to get started with advanced JavaScript development.
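As a hedged sketch of those 'core' tasks, the snippet below uses Ext Core's Ext.onReady, Ext.get, and Ext.Ajax.request; the element id and URL are invented for illustration, and the Ext global is typed loosely:

    // Ext Core sketch: DOM-ready handling, element styling, and a basic Ajax call.
    declare const Ext: any;   // provided by the ext-core script on the page

    Ext.onReady(() => {
      // Element augmentation: grab a DOM node by id and style it.
      Ext.get("status")?.addClass("loading");           // "status" is a hypothetical element id

      // Basic Ajax against a made-up endpoint.
      Ext.Ajax.request({
        url: "/data/latest.json",                       // hypothetical URL
        success: (response: any) => {
          Ext.get("status")?.update(response.responseText).removeClass("loading");
        },
        failure: () => Ext.get("status")?.replaceClass("loading", "error")
      });
    });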
There is a history of confusion among developers over Ext JS licensing, with perceptions that it is closed and expensive. Ext JS is developed by a for-profit company and certainly supports different models based on the intended use: open source, commercial, and OEM. However, the framework has remained open source and continues to benefit from a large group of community supporters who contribute user extensions and donate their time as forum moderators.
Ext Core continues this line of openness by being distributed under the permissive and easily understood Massachusetts Institute of Technology (MIT) license...
With all the great improvements to Ext JS, there is still a missing piece. There has never been an easy way to create custom themes in Ext JS, and this continues to be a disappointment. Although the CSS framework has been broken up into structural and visual bits, it does not address the need for a way to create the images used in the visual rendering of Ext JS components. Creating a theme for Ext JS 3.0 is more straightforward than in previous versions, but because components are image-heavy, it leaves the solution half-done..."
http://www.ibm.com/developerworks/web/library/wa-aj-extjs30/
See also the Ext JavaScript web site: http://www.extjs.com/
"Dean Hachamovitch, General Manager for Internet Explorer at Microsoft, has announced that IE9 will use only the H.264 standard to play HTML 5 video. Microsoft seems to have become very committed to HTML 5, while Flash loses even more ground. The announcement came the same day Steve Jobs detailed why Apple does not accept Flash on iPhone and iPad.
Microsoft seems to finally take HTML 5 very seriously. Hachamovitch
stated: 'The future of the web is HTML5' at the beginning of his blog post. And he added: 'HTML5 will be very important in advancing rich, interactive web applications and site design.' Many wondered what Microsoft was going to do about HTML 5 and, if it started integrating it, which video standard it would use. Microsoft has already started implementing some of the HTML 5 features...
Microsoft had to choose between the proprietary video standard H.264, a.k.a. Advanced Video Coding (AVC), and the free Ogg Theora codec.
Firefox and Opera are using the latter while Chrome implements both.
Microsoft has announced IE9 will use only H.264. Hachamovitch also mentioned that developers writing applications for Windows won't have to pay licenses for using the H.264 codec, not even when accessing hardware accelerated features because the license fee is already included in Windows...
Coming back to HTML 5, Firefox and Opera are the only major browser vendors which are still committed to Ogg Theora, which is generally regarded as providing lower quality than H.264. They blame the license fees required to use the proprietary standard. The standard is licensed by MPEG LA, a packager of patent pools currently managing the rights for technologies like MPEG-2, MPEG-4, ATSC, or IEEE 1394. Apple, Microsoft, and Google are among the licensees, but Apple and Microsoft are also licensors...
The H.264 license policy is evaluated every 5 years... Different royalties apply for creating codecs based on H.264 that could be included or not in an operating system, consuming video on a subscription or title-by-title basis, or for Internet broadcasts of free TV shows... For example, adding a H.264 codec to an operating system involves a fee of $5 Million/year during 2009-2010... H.264 is used across a large variety of devices including 'set-top boxes, media player and other personal computer software, mobile devices including telephones and mobile television receivers, Blu-ray Disc players and recorders, Blu-ray video optical discs, game machines, personal media player devices, still and video cameras, subscription and pay-per view or title video services, free broadcast television services and other products'..."
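For authors caught between the two camps, one common approach (sketched below with the standard HTML5 media API; the file names are placeholders) is to probe canPlayType() and serve whichever codec the browser reports it can handle:

    // Sketch: choose between H.264/AVC and Ogg Theora based on what the browser supports.
    const video = document.createElement("video");

    // canPlayType() returns "", "maybe", or "probably".
    const h264   = video.canPlayType('video/mp4; codecs="avc1.42E01E, mp4a.40.2"');
    const theora = video.canPlayType('video/ogg; codecs="theora, vorbis"');

    video.src = h264 !== "" ? "clip.mp4" : theora !== "" ? "clip.ogv" : "";   // placeholder files
    video.controls = true;
    document.body.appendChild(video);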
http://www.infoq.com/news/2010/04/Microsoft-HTML-5-H.264
See also CNET News.com 'Patent Challenge Looming for Open-Source Codecs': http://news.cnet.com/8301-30684_3-20003895-265.html
The U.S. White House is requiring federal agencies to consider using a standard configuration developed by the Justice and Homeland Security departments to share information across the public and private sectors.
More than a month ago, the Office of Management and Budget issued guidance to agencies on the website of the National Information Exchange Model (NIEM), a joint DOJ-DHS program. The OMB document, which is not posted on its website, includes instructions for assessing the framework's merits by May 01, 2010.
OMB did not make the public aware of such plans to overhaul federal information exchange on its website, raising questions about a lack of transparency, as well as the security of the model, according to privacy advocates. OMB officials noted that the NIEM website is public and pointed out that other OMB requirements such as information security standards for the federal government also are posted on other agency sites.
Some privacy groups still have to review the specifications and therefore could not comment, while others urged the Obama administration to fully disclose security procedures if agencies proceed with NIEM. Security experts familiar with the information technology setup at Justice and DHS praised the integrity of the framework and the idea of rolling it out governmentwide... NIEM launched in 2005 with the goal of linking jurisdictions throughout the country to better respond to crises, including terrorist attacks, natural disasters, large-scale crime and other emergencies handled by Justice and Homeland Security. The standards are intended to expedite the secure exchange of accurate information.
This winter, the Health and Human Services Department announced it will use NIEM as the foundation of a nationwide network for medical professionals to exchange patient data. Some in the health IT community expressed fears that if other agencies are using the same framework as doctors, the government could access private health information. HHS officials have emphasized that harmonizing standards for information exchange will not facilitate the transmission of medical records to law enforcement or intelligence agencies..."
http://www.nextgov.com/nextgov/ng_20100421_4060.php
See also the NIEM technical specifications: http://www.niem.gov/TechnicalDocuments.php
At the F8 conference in San Francisco, Facebook introduced the Open Graph protocol and the Graph API as the next evolution of the Facebook platform. The Open Graph protocol was originally created at Facebook and is inspired by Dublin Core, link-rel canonical, Microformats, and RDFa.
Discussion of the Open Graph Protocol takes place on the developer mailing list. It is currently being consumed by Facebook and is being published by IMDb, Microsoft, NHL, Posterous, Rotten Tomatoes, TIME, Yelp, and others.
"On Facebook, users build their profiles through connections to what they care about -- be it their friends or their favorite sports teams, bottles of wine, or celebrities. The Open Graph protocol opens up the social graph and lets your pages become objects that users can add to their profiles. When a user establishes this connection by clicking Like on one of your Open Graph-enabled pages, you gain the lasting capabilities of Facebook Pages: a link from the user's profile, ability to publish to the user's News Feed, inclusion in search on Facebook, and analytics through our revamped Insights product...
Facebook introduced three new components of the Facebook Platform, two of which are the Open Graph protocol and the Graph API. The API provides access to Facebook objects like people, photos, and events, and to the connections between them, like friends, tags, and shared content, via uniform and consistent URIs. Every object can be accessed using the URL 'https://graph.facebook.com/ID', where ID stands for the object's unique ID in the social graph. Every connection (CONNECTION_TYPE) that the Facebook object supports can be examined using the [designated] URL...
All of the objects in the Facebook social graph are connected to each other via relationships. The URIs also have a special identifier 'me'
which refers to the current user. The Graph API uses OAuth 2.0 for authorization, and the authentication guide has details of Facebook's OAuth 2.0 implementation. OAuth 2.0 is a simpler version of OAuth that leverages SSL for API communication instead of relying on complex URL signature schemes and token exchanges. At a high level, using OAuth 2.0 entails getting an access token for a Facebook user via a redirect to Facebook. After you obtain the access token for a user, you can perform authorized requests on behalf of that user by including the access token in your Graph API requests...
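A minimal sketch of that flow, using the browser's standard fetch API (the access token value, and the idea of calling it from page script, are assumptions for illustration):

    // Sketch: read a Graph API object on behalf of a user with an OAuth 2.0 access token.
    const accessToken = "USER_ACCESS_TOKEN";   // obtained earlier via the OAuth 2.0 redirect flow

    async function readGraphObject(id: string): Promise<unknown> {
      // 'me' is the special identifier for the current user, e.g. readGraphObject("me").
      const response = await fetch(
        `https://graph.facebook.com/${id}?access_token=${encodeURIComponent(accessToken)}`
      );
      if (!response.ok) throw new Error(`Graph API request failed: ${response.status}`);
      return response.json();
    }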
http://www.infoq.com/news/2010/04/facebook-graph-api
See also the Open Graph Protocol specification web site: http://opengraphprotocol.org/
This paper was presented in the WWW 2010 Conference forum on Semi-Structured Data. "RDF is being used extensively to represent data from the fields of bioinformatics, life sciences, social networks, and Wikipedia as well. Since disk space is getting cheaper, storing this huge RDF data does not pose as big a problem as executing queries on it. Querying these huge graphs needs scanning the stored data and indexes created over it, reading that data inside memory, executing query algorithms on it, and building the final results of the query.
Hence, a desired query processing algorithm is one which: (i) keeps the underlying size of the data small (using compression techniques), (ii) can work on the compressed data without uncompressing it, and (iii) doesn't build large intermediate results...
The Semantic Web community, until now, has used traditional database systems for the storage and querying of RDF data. The SPARQL query language also closely follows SQL syntax. As a natural consequence, most of the SPARQL query processing techniques are based on database query processing and optimization techniques. For SPARQL join query optimization, previous works like RDF-3X and Hexastore have proposed to use 6-way indexes on the RDF data. Although these indexes speed up merge-joins by orders of magnitude, for complex join queries generating large intermediate join results, the scalability of the query processor still remains a challenge.
In this paper, we introduce (i) BitMat -- a compressed bit-matrix structure for storing huge RDF graphs, and (ii) a novel, light-weight SPARQL join query processing method that employs an initial pruning technique, followed by a variable-binding-matching algorithm on BitMats to produce the final results. Our query processing method does not build intermediate join tables and works directly on the compressed data. We have demonstrated our method against RDF graphs of up to 1.33 billion triples -- the largest among results published until now (single-node, non-parallel systems), and have compared our method with the state-of-the-art RDF stores -- RDF-3X and MonetDB.
Our results show that the competing methods are most effective with highly selective queries. On the other hand, BitMat can deliver 2-3 orders of magnitude better performance on complex, low-selectivity queries over massive data..."
http://www.cs.rpi.edu/~zaki/PaperDir/WWW10.pdf
See also the WWW2010 conference papers: http://www2010.org/www/program/papers/
Members of the W3C RDFa Working Group have published a First Public Working Draft for the specification "RDFa Core 1.1: Syntax and Processing Rules for Embedding RDF Through Attributes." The document is intended to become a W3C Recommendation. A sample test harness is available, though its set of tests is not intended to be exhaustive.
Users may find the tests to be useful examples of RDFa usage. This document is expected to supersede the 'RDFa in XHTML (RDFa 1.0)' specification.
RDFa provides a set of attributes -- some reused from existing markup languages and a few new ones -- to augment visual data with machine-readable hints. Attributes that already exist in widely deployed languages (e.g., HTML) have the same meaning they always did, although their syntax has been slightly modified in some cases. For example, in (X)HTML, '@rel' already defines the relationship between one document and another. However, in (X)HTML there is no clear way to add new values; RDFa sets out to explicitly solve this problem, and does so by allowing URIs as values. It also introduces the idea of 'compact URIs' -- referred to as CURIEs in this document -- which allow a full URI value to be expressed succinctly...
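A small illustrative fragment, embedded here as a string (the Dublin Core prefix is declared RDFa 1.0-style with xmlns:, and the resource and literal values are made up):

    // Illustrative RDFa markup; CURIEs such as "dc:title" expand against the declared prefix.
    const rdfaFragment = `
      <div xmlns:dc="http://purl.org/dc/terms/" about="http://example.org/report">
        <span property="dc:title">Quarterly Report</span>, written by
        <span property="dc:creator">Alice Example</span>.
        See the <a rel="dc:source" href="http://example.org/data">source data</a>.
      </div>`;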
Background: "The current Web is primarily made up of an enormous number of documents that have been created using HTML. These documents contain significant amounts of structured data, which is largely unavailable to tools and applications. When publishers can express this data more completely, and when tools can read it, a new world of user functionality becomes available, letting users transfer structured data between applications and web sites, and allowing browsing applications to improve the user experience: an event on a web page can be directly imported into a user's desktop calendar; a license on a document can be detected so that users can be informed of their rights automatically; a photo's creator, camera setting information, resolution, location and topic can be published as easily as the original photo itself, enabling structured search and sharing.
RDFa Core is a specification for attributes to express structured data in any markup language. The embedded data already available in the markup language (e.g., XHTML) is reused by the RDFa markup, so that publishers don't need to repeat significant data in the document content. The underlying abstract representation is RDF, which lets publishers build their own vocabulary, extend others, and evolve their vocabulary with maximal interoperability over time. The expressed structure is closely tied to the data, so that rendered data can be copied and pasted along with its relevant structure...
RDFa shares some of the same goals with microformats. Whereas microformats specify both a syntax for embedding structured data into HTML documents and a vocabulary of specific terms for each microformat, RDFa specifies only a syntax and relies on independent specification of terms (often called vocabularies or taxonomies) by others. RDFa allows terms from multiple independently-developed vocabularies to be freely intermixed and is designed such that the language can be parsed without knowledge of the specific vocabulary being used..."
http://www.w3.org/TR/2010/WD-rdfa-core-20100422/
See also W3C Semantic Web: http://www.w3.org/standards/semanticweb/
"Mashups are an architectural style that combines data and/or content from different data sources or sites. Mashups are normally differentiated based on the use, architecture style, and data. While consumer mashups have been in use for a while, we're now see them moving into the enterprise. In a general sense, what differentiates a consumer mashup from an enterprise mashup scene is that enterprise mashups are built following standard guidelines such those promoted by the Open Mashup Alliance (OMA), a standard model proposed by mashup vendors. The OMA defines an Enterprise Mashup Markup Language (EMML) which is used to define mashups in a standardized manner. The mashup defined thus can be deployed in any of the mashup runtimes which is implemented as per the specifications provided by the OMA.
In this article, I examine the importance of the OMA, the mashup architecture proposed by the OMA, and the ease with which developers can create, deploy, and test a mashup developed as per the EMML specification...
EMML supports an extensive set of operations and commands to handle simple to complex processing needs. EMML also supports the results of one mashup as input to another. All in all, this should give you a head start for developing more complicated mashups by referring to the detailed EMML documentation provided in the OMA portal.
EMML is a declarative mashup domain-specific language which eliminates complex and procedural programming requirements for creating mashups. It is an open specification and the language is free to use. EMML thus removes vendor lock-in and allows portability of the mashup solution.
OMA has released the EMML specification, EMML schema, and an open source reference implementation where EMML scripts can be deployed and tested...
An EMML file is the mashup script that has a ".emml" extension and uses the Enterprise Mashup Markup Language. The mashup script defines the services, operations and responses to be constructed based on the results generated. The input to the EMML script could be through any of the data sources such as XML, JSON, JDBC, Java Objects or Primitive types. EMML provides a uniform syntax to call any of the service styles -- REST, SOAP, RSS/ATOM, RDBMS, POJO or web clipping from HTML pages. Complex programming logic is also supported through an embedded scripting engine which supports JavaScript, Groovy, JRuby, XPath and XQuery. The EMML script is then deployed on to any J2EE compliant application server where the EMML runtime has been deployed. The mashup is then accessible as REST service using a URL with the mashup name. The Mashup service returns the result in XML format..."
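Since a deployed mashup is exposed as a REST service returning XML, a consuming page might invoke it along these lines (a sketch only; the runtime host and mashup name are hypothetical):

    // Sketch: call a deployed EMML mashup through its REST URL and parse the XML result.
    async function runMashup(name: string): Promise<Document> {
      // The EMML runtime exposes each deployed mashup under a URL containing the mashup name.
      const response = await fetch(`http://mashup-host.example.com/emml/${name}`);   // hypothetical host
      const xmlText = await response.text();
      return new DOMParser().parseFromString(xmlText, "application/xml");
    }

    runMashup("catalogPriceCompare").then(doc =>
      console.log(doc.documentElement.nodeName)   // inspect the root of the mashup's XML result
    );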
http://www.drdobbs.com/java/224300049
See also the OMA EMML Documentation: http://www.openmashup.org/omadocs/v1.0/index.html
"The enthusiasm for Big Data applications has us putting persistent data solutions under a microscope these days. It must be noted that although Big Data applications involve operations with large data sets, their function can vary from online transaction processing to analytics to semantics-driven information retrieval. And an application might be using a distributed key-value store, a row- or column-order store, a set store, a triples store or some other technology...
The BI community represents only one slice of the Big Data user pie.
The piece that represents the Linked Data / Web 3.0 / Semantic Web community isn't as large, but that community is growing. In March 2010, Oxford University and the University of Southampton announced that a new Institute for Web Science will lead the way in Web 3.0 development with £30 million in funding from the UK government...
Cassandra, Hadoop Map/Reduce, Greenplum and other engines come up frequently in discussions about Big Data. But if Sir Tim Berners-Lee has his way, we'll be having more discussions about solutions for Really Big Data.
The W3C Resource Description Framework (RDF) defines a triples data model that's gained acceptance for Semantic Web applications, Linked Data and building out Web 3.0. There are a variety of data stores capable of handling billions of RDF triples, including OpenLink Virtuoso, Ontotext BigOWLIM, AllegroGraph, YARS2, and Garlik 4store. Raytheon BBN Technologies has approached the triples store problem from the perspective of using a cloud-based technology known as SHARD (Scalable, High-Performance, Robust and Distributed).
SHARD is a distributed data store for RDF triples that supports SPARQL queries. It's based on Cloudera Hadoop Map/Reduce and it's been deployed in the cloud on Amazon EC2. SHARD uses an iterative process with a MapReduce operation executed for each single clause of a SPARQL query. According to Kurt Rohloff, a researcher at Raytheon BBN, SHARD performs better than current industry-standard triple-stores for datasets on the order of a billion triples..."
http://www.drdobbs.com/blog/archives/2010/04/movement_on_the.html
"The maturity of SVG allows for a little-known style of use and development of currently undocumented visual elements. In a time when data-as-a-service is blossoming, it makes a lot of sense to script SVG instances from an enclosing Web application. A specific example of a dynamic choropleth illustrates how easy this technique can be... This article will interest both Web developers and their managers. While the coding is simple enough to be thoroughly understood, it models a GUI effect that goes beyond traditional form-based Web application.
The effect: (1) Depends only on public standards; (2) Performs at least as well as proprietary alternatives; (3) Opens up new models of teamwork and collaboration; (4) Represents an implementation technique that has apparently never before been documented explicitly...
The article has three aims: to demonstrate the operation of a specific standards-compliant Web-based GUI effect from a user's perspective; to explain the model as an example of deep partnership between Web and SVG technologies from a developer's perspective; and to illustrate how
HTML5 promotes new divisions of labor in development of complex Web applications, as team leads will want to know. It is written for those who can effectively read the HTML and JavaScript typical of Web pages, but not necessarily write these languages fluently. No prior experience with SVG is necessary. Familiarity with XML is required.
While there is no dependence in the presentation on a particular operating system, exposure to a range of browsers is expected...
Until recently, though, dynamic graphical interaction was largely the domain of proprietary or desktop applications. As this article shows, SVG now makes a good foundation for advanced standards-based effects with Web applications. SVG and, more generally, the collection of capabilities abbreviated as HTML5, open Web application development to dramatic new frontiers... As data as a service takes off, SVG will enable a wide range of visualizations which will be conveniently de-coupled from particular content. In the early days of the Web, the only facilities which provided the possibility of interaction of the sort this article illustrates were image maps and proprietary plugins.
The former were notoriously tedious to write and even harder to maintain. SVG, in contrast, acts as a higher-level language for development of a wide range of graphical effects.
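To make the choropleth idea concrete, here is a hedged sketch: given an inline SVG whose regions carry ids (the ids and data values below are invented), the enclosing page recolors them from data using ordinary DOM calls:

    // Sketch: recolor regions of an inline SVG choropleth from a data set.
    const salesByRegion: Record<string, number> = { north: 42, south: 87, west: 15 };

    function shade(value: number): string {
      // Map a value in 0..100 to a blue of varying intensity.
      const intensity = Math.round(255 - (value / 100) * 200);
      return `rgb(${intensity}, ${intensity}, 255)`;
    }

    for (const [region, value] of Object.entries(salesByRegion)) {
      const path = document.getElementById(region);     // e.g. <path id="north" .../> in the SVG
      if (path) path.setAttribute("fill", shade(value));
    }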
Bio Note: "For long-time developer Cameron Laird, 'graphics' has, on different occasions, meant everything from images of brains scans and aircraft cockpit displays to pipeline control boards and commodity market charts. He's an enthusiastic Invited Expert to the W3C SVG Interest Group precisely because SVG promises to relieve so many problems common to these diverse domains. Cameron also is a prolific author of articles for developerWorks and other professional publications, as well as the Smart Development blog. Cameron consults as vice-president of Phaseit, Inc., which he co-founded..."
http://www.ibm.com/developerworks/library/x-svgclientside/
See also the W3C Scalable Vector Graphics (SVG) web site: http://www.w3.org/Graphics/SVG/
W3C has published a First Public Working Draft for "Emotion Markup Language (EmotionML) 1.0." The document was developed by the W3C Multimodal Interaction Working Group, and the Working Group expects to advance this Working Draft to Recommendation Status. The draft specification draws on previous work in the Emotion Markup Language Incubator Group which proposed elements of a generally usable markup language for emotions and related states, as well as on the earlier Emotion Incubator Group (2006-2007) which had identified a comprehensive list of requirements arising from use cases of an Emotion Markup Language... The group expects a process of condensing this document into a simpler, ready-to-use specification which removes unclear parts of the draft and cuts redundancy with related languages such as 'EMMA: Extensible MultiModal Annotation Markup Language'.
"Human emotions are increasingly understood to be a crucial aspect in human-machine interactive systems. Especially for non-expert end users, reactions to complex intelligent systems resemble social interactions, involving feelings such as frustration, impatience, or helplessness if things go wrong. Furthermore, technology is increasingly used to observe human-to-human interactions, such as customer frustration monitoring in call center applications. Dealing with these kinds of states in technological systems requires a suitable representation, which should make the concepts and descriptions developed in the affective sciences available for use in technological contexts...
The present draft specification of Emotion Markup Language 1.0 aims to strike a balance between practical applicability and scientific well-foundedness. The language is conceived as a 'plug-in' language suitable for use in three different areas: (1) manual annotation of data; (2) automatic recognition of emotion-related states from user behavior; and (3) generation of emotion-related system behavior..."
http://www.w3.org/TR/2009/WD-emotionml-20091029/
See also the W3C Multimodal Interaction Activity: http://www.w3.org/2002/mmi/
"The idea of specific video and audio tags within HTML would have been technically impossible in HTML 3 and even somewhat infeasible in HTML Version 4. Because HTML 4.0 essentially was a 'frozen' version, the specific mechanism for displaying content has been very much format dependent (e.g., Apple QuickTime Movies and Flash video) and usually relies upon tags with varying parameters for passing the relevant information to the server. As a result, video and audio embedding on web pages has become something of a black art...
It's perhaps not surprising, then, that the 'audio' and 'video' tags were among the first features to be added to the HTML 5 specification, and these seem to be the first elements of the HTML 5 specification that browser vendors implemented. These particular elements are intended to enable the browser to work with both types of media in an easy-to-use manner. An included support API gives users finer-grained control...
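That finer-grained control comes through the HTMLMediaElement scripting API; a small sketch (the file name is a placeholder):

    // Sketch: script an HTML5 audio element through the media API.
    const audio = document.createElement("audio");
    audio.src = "theme.ogg";          // placeholder audio file
    audio.preload = "auto";
    document.body.appendChild(audio);

    // play(), pause(), and currentTime are all part of the scripting API.
    audio.play();
    setTimeout(() => {
      console.log(`paused at ${audio.currentTime.toFixed(1)}s`);
      audio.pause();
    }, 10_000);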
Theoretically, the 'video' and 'audio' elements should be able to handle most of the codecs currently in use. In practice, however, the browsers that do currently support these elements do so only for the open source Ogg Vorbis and Theora standards. The names may not be familiar to you...
The Ogg Vorbis standard is both open source and high fidelity, compared with the better-known MPEG formats. As such, Ogg Vorbis is a popular format for storing audio tracks for games and online applications. The HTML 5 specification does not give any preference to Ogg Vorbis/Theora over other formats, but it is the one supported by Firefox (exclusively, at this point). The Chrome and Safari teams both have announced intentions to support the two standards in addition to others...
By all indications, the browser vendors view the multimedia aspects as perhaps the most crucial in the developing HTML 5 standard. Given the complexity of both types of media and the prospect of being able to better promote video and audio usage within web browsers, it's hardly surprising. However, before HTML 5 multimedia becomes ubiquitous, the state of the art for these browser implementations still has a ways to go technically..."
http://www.devx.com/webdev/Article/43324
See also the video element in an HTML 5 draft: http://dev.w3.org/html5/spec/Overview.html#video
"XML is SGML for the Web, but it hasn't made as big a splash on the Web as the XML community would like. The most prominent effort for XML on the Web, XHTML, has been dogged by politics and design-by-committee, and other ambitious, technically sound specs such as XForms and SVG have struggled with slow uptake. The success of XML on the Web has come in sometimes unexpected directions, including the popularity of Web feeds, XML formats such as the RSS flavors and Atom...
Many dynamic HTML developers are tired of the cross-browser pain and scripting quirks across browsers. The emergence of several excellent JavaScript libraries makes life easier for developers. One of the most popular of these libraries is jQuery, which has been covered in several articles here on developerWorks. You can also use jQuery to process XML, if you learn how to drive around the monster potholes. This article shows how to use jQuery to process the Atom Web feed format. Web feed XML is perhaps the most pervasive XML format around, and the main fulfillment of the promise of XML on the Web. But most such formats use XML namespaces, which cause issues with many popular JavaScript libraries, including jQuery.
jQuery is all about packaging up all the tricks and workarounds for dealing with Web browser oddities, and the XML workbench introduced in this article is a first step towards such a reusable tool for those needing to deal with XML. You see how one of the biggest problems is dealing with namespaces. Once you get past that hurdle, jQuery gives you the tools to deal with the many sorts of irregular documents so aptly expressed with XML. You'll discover how readily the techniques developed processing Web feeds can be applied to many other XML formats within the browser. If you find jQuery and attendant workarounds unsuitable, another option is to use a JavaScript library more directly targeted at XML processing, such as Sarissa, which is worth an article of its own, but is not as widely used, nor as easy to deploy as jQuery.
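As one hedged illustration of the namespace hurdle, the sketch below parses an Atom feed with the browser's own namespace-aware DOM calls and uses jQuery (assumed to be loaded as the usual $ global) only for rendering into the page:

    // Sketch: namespace-aware Atom parsing with the DOM, plus jQuery for HTML output.
    declare const $: any;   // the jQuery global, assumed to be loaded on the page

    const ATOM_NS = "http://www.w3.org/2005/Atom";

    function listEntryTitles(atomXml: string): string[] {
      const doc = new DOMParser().parseFromString(atomXml, "application/xml");
      // getElementsByTagNameNS sidesteps the selector/namespace quirks described above.
      return Array.from(doc.getElementsByTagNameNS(ATOM_NS, "entry")).map(entry => {
        const title = entry.getElementsByTagNameNS(ATOM_NS, "title")[0];
        return title?.textContent ?? "";
      });
    }

    // Render the titles into a (hypothetical) #feed list on the HTML side.
    function renderTitles(titles: string[]): void {
      titles.forEach(t => $("#feed").append($("<li/>").text(t)));
    }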
http://www.ibm.com/developerworks/xml/library/x-feedjquery/index.html
See also jQuery resources: http://jquery.com/
"If you want to see the scale of browser makers' ambition to remake not just the Web but computing itself, look no farther than a new 3D technology called WebGL... WebGL, while only a nascent attempt to catch up, is real. WebGL now is a draft standard for bringing hardware-accelerated 3D graphics to the Web. It got its start with Firefox backer Mozilla and the Khronos Group, which oversees the OpenGL graphics interface, but now the programmers behind browsers from Apple, Google, and Opera Software are also involved.
Perhaps more significant than formal standards work, though, is WebGL support in three precursors of today's browsers -- Minefield for Mozilla's Firefox, WebKit for Apple's Safari, and Chromium for Google's Chrome. Opera has started implementing WebGL...
WebGL is one of a handful of efforts under way to boost the processing power available to Web applications. It marries two existing technologies.
First is JavaScript, the programming language widely used to give Web pages intelligence and interactivity. Although JavaScript performance is improving relatively quickly these days in many browsers, programs written in the language are relatively pokey and limited compared with those that run natively on a computer. Second is OpenGL ES, a 2D and 3D graphics interface for devices such as phones or car navigation systems with limited horsepower. If a computer's graphics system has an OpenGL driver, software written to use OpenGL can tap directly into the graphics system's hardware acceleration. WebGL links these two so JavaScript programs can call upon 3D abilities, with the HTML5 technology also under development acting as glue..."
From the specification: "The WebGL Specification describes an additional rendering context and support objects for the HTML 5 canvas element.
This context allows rendering using an API that conforms closely to the OpenGL ES 2.0 API... The HTMLCanvasElement places an element on the page into which graphic images can be rendered using a programmatic interface.
Currently the only such interface described is the CanvasRenderingContext2D.
This document describes another such interface, WebGLRenderingContext, which presents an API derived from the OpenGL ES 2.0 specification.
This API provides a rich set of functions allowing realistic 3D graphics to be rendered..."
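A minimal sketch of obtaining that rendering context from a canvas element (early builds exposed it under the experimental name, so both are probed here):

    // Sketch: request a WebGL rendering context from an HTML5 canvas.
    const canvas = document.createElement("canvas");
    canvas.width = 640;
    canvas.height = 480;
    document.body.appendChild(canvas);

    // Experimental builds used the "experimental-webgl" context name.
    const gl = (canvas.getContext("webgl") ??
                canvas.getContext("experimental-webgl")) as WebGLRenderingContext | null;

    if (gl) {
      gl.clearColor(0.0, 0.0, 0.3, 1.0);      // dark blue
      gl.clear(gl.COLOR_BUFFER_BIT);          // real 3D work would follow from here
    } else {
      console.warn("WebGL is not available in this browser.");
    }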
http://news.cnet.com/8301-30685_3-10416966-264.html
See also the WebGL Specification: https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/doc/spec/WebGL-spec.html
Members of the W3C Cascading Style Sheets (CSS) Working Group now invite implementation of two Candidate Recommendation specifications: "CSS Multi-column Layout Module" and "CSS Backgrounds and Borders Module Level 3."
CSS is a language for describing the rendering of structured documents (such as HTML and XML) on screen, on paper, in speech, etc. It is playing an important role in styling not just HTML, but also many kinds of XML
documents: XHTML, SVG (Scalable Vector Graphics) and SMIL (the Synchronized Multimedia Integration Language), to name a few. It is also an important means of adapting pages to different devices, such as mobile phones or printers...
"CSS Multi-column Layout Module" describes how content can be flowed into multiple columns with a gap and a rule between them. On the Web, tables have also been used to describe multi-column layouts. The main benefit of using CSS-based columns is flexibility; content can flow from one column to another, and the number of columns can vary depending on the size of the viewport. Removing presentation table markup from documents allows them to more easily be presented on various output devices including speech synthesizers and small mobile devices... This document will remain at Candidate Recommendation stage at least until 17-June-2010. For this specification to be proposed as a W3C Recommendation, the following conditions shall be met: there must be at least two independent, interoperable implementations of each feature. Each feature may be implemented by a different set of products, so there is no requirement that all features be implemented by a single product...
"CSS Backgrounds and Borders Module Level 3" presents the features of CSS level 3 relating to borders and backgrounds. It includes and extends the functionality of CSS level 2, which builds on CSS level 1. The main extensions compared to level 2 are borders consisting of images, boxes with multiple backgrounds, boxes with rounded corners and boxes with shadows. When elements are rendered according to the CSS box model, each element is either not displayed at all, or formatted as one or more rectangular boxes. Each box has a rectangular content area, a band of padding around the content, a border around the padding, and a margin outside the border. The margin may actually be negative, but margins have no influence on the background and border... This document will remain at Candidate Recommendation stage at least until 17-June-2010.
http://www.w3.org/News/Public/pnews-20091221
See also the W3C Cascading Style Sheets (CSS) Working Group: http://www.w3.org/Style/CSS/members
"[One approach to adding new functionality to web sites] may have the potential to both simplify your applications and contribute significantly to reuse. The idea behind it is deceptively simple: in a web page's CSS page, you define what's called a behavior, a script that binds to a given behavior language document written in mixed XML and java-script called the XML Binding Language (XBL). Once the page loads, any element that's associated with that particular rule will gain the behavior, essentially acting as a new 'element' with its own presentation, its own responses to user input, and its own underlying data. XBL has been floating around in various incarnations since the early 2000s. Microsoft created a type of binding called behaviors in the late 1990s, but the technology never really caught on with other browsers...
Mozilla announced in 2008 that they were looking to have an XBL 2 version likely with their 4.0 release ... this effort is underway now, and it's very likely that they will achieve this goal, especially with a formal candidate recommendation status that's unlikely to change its underlying functionality. Google, for its part, took an alternative route that's similar to the approach they took recently with the SVG Web project. Rather than waiting for other browser vendors to adopt XBL 2, they've recently created a JavaScript-based XBL2 code project [...] with the code designed in such a way that it will work across any browser.
Currently it supports all major browser versions produced within the last four years...
It's hard to say whether Google's XBL2 implementation will really catch on, although it has a number of factors going for it. The code is remarkably cross-platform -- it works on all contemporary browsers with the possible exception of Konqueror. That doesn't necessarily mean that JavaScript code written within the bindings will satisfy that same restriction, of course, but having the framework in place can go a long way to making such code browser independent.
Bindings make for cleaner layout and code and encourage componentization at the browser level, which in turn promotes code reuse and the development of core libraries... Overall, it may very well be that XBL's time is just now arriving. With the stabilizing of the AJAX space, and the resources of a company like Google behind it, XML as a binding language has a great deal to offer and very little downside..."
http://www.devx.com/webdev/Article/43595
See also Cross-browser XBL 2 implementation in JavaScript: http://code.google.com/p/xbl/
All browser makers should take a page from Google's Chrome and isolate untrusted data from the rest of the operating system, a noted security researcher said. Dino Dai Zovi, co-author of The Mac Hacker's Handbook, believes that the future of security relies on 'sandboxing,' the practice of separating application processes from other applications, the operating system and user data... He sees browser sandboxing as an answer to the flood of exploits that have overwhelmed users in the past year.
In a blog article, Dai Zovi described sandboxing, as well as the lesser security technique of "privilege reduction," as "moving the bull (untrusted data) from the china shop (your data) to the outside where it belongs (a sandbox)." The idea behind sandboxing is to make it harder for attackers to get their malicious software onto machines. Even if an attacker was able to exploit a browser vulnerability and execute malware, he would still have to exploit another vulnerability in the sandbox technology to break into the operating system and, thus, get to the user's data...
Chrome has included sandboxing since its September 2008 debut. And while Dai Zovi considers it easily the leader in security because of that, other browsers have made, or will make, their own stabs at reducing users'
risks. For example, Microsoft's Internet Explorer 7 (IE7) and IE8 on Vista and Windows 7 include a feature dubbed "Protected Mode," which reduces the privileges of the application so that it's difficult for attackers to write, alter or destroy data on the machine, or to install malware. But it's not a true sandbox as far as Dai Zovi is concerned.
Currently, Mozilla's Firefox, Apple's Safari and Opera Software's Opera lack any sandboxing or privilege reduction features..."
http://www.computerworld.com/s/article/9143518/Chrome_sets_browser_security_standard_says_expert
See also Ian Hickson's MIME-type proposal for 'text/sandboxed-html': http://lists.w3.org/Archives/Public/public-html/2010Jan/thread.html#msg490
"A difference of opinion among developers has become a high-profile debate over the future of the Web: should programmers continue using Adobe Systems' Flash or embrace newer Web technology instead? The debate has gone on for years, but last week's debut of Apple's iPad -- which like the iPhone doesn't support Flash -- turned up the heat. Before that, Adobe had been saying with some restraint that it's happy to bring Flash to the iPhone when Apple gives the go-ahead. But Chief Technology Officer Kevin Lynch took the gloves off Tuesday with a blog post that said Apple's reluctance to include Flash on its 'magical device' means iPad buyers will effectively see a crippled Web. And he played the Google Nexus One card, too...
Flash has indeed spread to near-ubiquity on computers, with better than
98 percent penetration, according to Adobe's statistics. Its roots lay with graphical animations, but its success was cemented by providing an easy streaming video mechanism to a Web that had been plagued with obstreperous and incompatible technology from Microsoft, Apple, and Real. But a collection of new technologies -- including a rejuvenated HTML (Hypertext Markup Language) standard used to write Web pages -- are aiming to reproduce some of what Flash offers.
Bruce Lawson, Web standards evangelist for browser maker Opera Software, believes HTML and the other technologies inevitably will replace Flash and already collectively are "very close" to reproducing today's Flash abilities. It's not just a matter of the installed base of Flash on the Web, though. HTML5 and its associated technologies are maturing rapidly, and because they evolve concurrently with browser support, they're arriving and relevant now even though incomplete...
After years of HTML standardization disarray, browser makers Apple, Opera, Mozilla, and most recently Google now are hammering out new directions for Web standards. Perhaps the most visible HTML5 aspect is built-in support for audio and video, but there are other HTML abilities under way: storing data on a computer for use by an application, Web Sockets for periodically pushing updates to a browser, Web Workers for letting Web programs perform multiple tasks at once, and Canvas for better two-dimensional graphics. At the same time, these allies marching under the "Open Web" banner also are creating new standards such as WebGL for accelerated 3D graphics on the Web, enabling better typography through CSS (Cascading Style Sheets) and Web fonts, beefing up support for others including SVG (Scalable Vector Graphics), and improving the power of JavaScript for writing Web-based programs..."
http://news.cnet.com/8301-30685_3-20000037-264.html
See also the draft HTML 5 specification: http://dev.w3.org/html5/spec/Overview.html
The W3C XML Core Working Group has published a Proposed Recommendation for "XML Linking Language (XLink) Version 1.1", together with an "Implementation Report for XLink 1.1" which lists preliminary implementation feedback about XLink 1.1 implementations. Though the previous version of this document was a Last Call Working Draft, there was an earlier Candidate Recommendation version of this document that had already resulted in successful implementation feedback.
Given that the changes to this draft do not affect the validity of that earlier implementation feedback, the Working Group is now publishing this version as a Proposed Recommendation. The review period ends on 31-March-2010.
This specification implements all of the XLink 1.1 requirements documented in the Working Group Note "Extending XLink 1.0". These changes make XLink more useful in the places where it is already being used and make it practical in a variety of similar vocabularies.
Proposed Changes in the Note included making simple XLinks an application-level default (any element with an 'xlink:href' attribute that does not specify a link type should be treated as a simple link), explicitly reserving all attributes in the XLink namespace, supporting IRIs, and providing sample non-normative XML Schema and RELAX NG Grammars.
XML Linking Language (XLink) Version 1.1 allows elements to be inserted into XML documents in order to create and describe links between resources. It uses XML syntax to create structures that can describe links similar to the simple unidirectional hyperlinks of today's HTML, as well as more sophisticated links. An important application of XLink is in hypermedia systems that have hyperlinks.
A simple case of a hyperlink is an HTML A element, which has these
characteristics: (1) The hyperlink uses IRIs as its locator technology;
(2) The hyperlink is expressed at one of its two ends; (3) The hyperlink identifies the other end -- although a server may have great freedom in finding or dynamically creating that destination; (4) Users can initiate traversal only from the end where the hyperlink is expressed to the other end; (5) The hyperlink's effect on windows, frames, go-back lists, style sheets in use, and so on is determined by user agents, not by the hyperlink itself. For example, traversal of 'A'
links normally replaces the current view, perhaps with a user option to open a new window. This set of characteristics is powerful, but the model that underlies them limits the range of possible hyperlink functionality. The model defined in this specification shares with HTML the use of IRI technology, but goes beyond HTML in offering features, previously available only in dedicated hypermedia systems, that make hyperlinking more scalable and flexible. Along with providing linking data structures, XLink provides a minimal link behavior model; higher-level applications layered on XLink will often specify alternate or more sophisticated rendering and processing treatments..."
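For illustration, here is a fragment in the spirit of that simplification, embedded as a string (the 'report' and 'citation' element names are invented; only the XLink namespace and attributes are real). Under XLink 1.1 the bare xlink:href is enough to make 'citation' a simple link:

    // Illustrative XLink 1.1 markup; no xlink:type is needed for a simple link.
    const xlinkExample = `
      <report xmlns:xlink="http://www.w3.org/1999/xlink">
        <citation xlink:href="http://example.org/studies/2010-02"
                  xlink:title="Background study"/>
      </report>`;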
http://www.w3.org/TR/2010/PR-xlink11-20100225/Overview-diffCR.html
See also the W3C XML Core Working Group: http://www.w3.org/XML/Core/
W3C's Mobile Web Test Suites Working Group has just released a brand new Web Compatibility Test for Mobile Browsers. Based on the same idea of evaluating support of a number of Web technologies at a glance as in the first Web Compatibility Test published in July 2008, this second version features a number of more recent technologies that promise to make Web browsers more powerful, in particular on mobile devices.
A blog article from Kai Hendry notes: "Go and test your mobile with the new test, and if your browser scores a 110% you are cheating... In this fresh forward looking 2.0 test we hope to encourage key technologies that will make the mobile platform simply rock. Of course we have the usual suspects like AJAX support and canvas which were tested in the WCTMB v1 test too. However we gear up by checking for Geolocation support which is very relevant to mobile users and for various helpful offline technologies like application cache and Web storage. These offline technologies help the Web in areas where Internet may be unreliable, which is a lot of places on most mobile devices!
We also make a daring leap into the fray to ask for support of video and audio, which is quite demanding on a mobile device. We allow for all sorts of codecs, though midi files and animated gifs won't pass. :) We also test for new input types, rich text editing and font face support which could be a workaround where phones have a poor font, for instance for a particular locale. No matter where you are from or what language you speak, we hope to entangle you in the Web with any device to hand..."
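A compatibility test of this kind ultimately boils down to feature detection; a simplified sketch for a few of the features mentioned (geolocation, Web storage, application cache, canvas):

    // Sketch: detect a few of the features a mobile compatibility test looks for.
    const support = {
      geolocation: "geolocation" in navigator,
      webStorage: (() => {
        try { return typeof window.localStorage !== "undefined"; } catch { return false; }
      })(),
      appCache: "applicationCache" in window,
      canvas: typeof document.createElement("canvas").getContext === "function"
    };

    console.table(support);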
The W3C Mobile Web Initiative is focusing on developing best practices for mobileOK Web sites and Web applications, device information needed for content adaptation, test suites for mobile browsers, and marketing and outreach activities... While becoming increasingly popular, mobile Web access today still suffers from interoperability and usability problems. W3C's Mobile Web Initiative addresses these issues through a concerted effort of key players in the mobile production chain, including authoring tool vendors, content providers, handset manufacturers, browser vendors and mobile operators. With mobile devices, the Web can reach a much wider audience, and at all times in all situations. It has the opportunity to reach into places where wires cannot go, to places previously unthinkable (e.g., providing medical information to mountain rescue scenes) and to accompany everyone as easily as they carry the time in their wristwatches. Moreover, today, many more people have access to mobile devices than access to a desktop computer. This is likely to be very significant in developing countries, where Web-capable mobile devices may play a similar role for deploying widespread Web access as the mobile phone has played for providing POTS..."
http://www.w3.org/2010/01/wctmb2/
See also the blog article: http://www.w3.org/2005/MWI/Tests/blog/2010/02/09/wctmbv2
Microsoft has issued an official blog posting on March 2, 2010 detailing its engineers' thinking process behind the build of Internet Explorer 8, the latest version of its Web browser, and how it chooses to render certain Websites. According to the blog posting, some 19 percent of high-traffic Websites currently render in IE 8 standards mode. Microsoft is working to reduce the list of Websites for which IE 8 needs a feature called Compatibility View in order to render all elements properly... The number of Websites that require IE 8's Compatibility View has apparently declined from over 3,100 to just over 2,000 in the last 12 months, according to Microsoft.
The blog article "How IE8 Determines Document Mode" was authored by Marc Silbey. Internet Explorer Program Manager. Excerpts: "This post describes how IE8 determines what Document Mode such as Quirks or Standards Modes to use for rendering websites. This topic is important for site developers and consumers. It's related to the Compatibility View List that we recently updated. This list is down by over 1000 websites, from over 3100 to just over 2000, since IE8 released last March. As we work with site developers and standards bodies, we're excited to see the sites that need to be on the Compatibility View (CV) List continue to go down...
When looking at the doctype and X-UA-Compatible meta tag and header on thousands of high traffic websites worldwide such as qq.com, netlog.com and those on the initial CV List, (1) 26% specify Quirks such as amazon.com, tworld.co.kr, and unibanco.com.br; (2) 41% specify a Transitional doctype that puts them in Almost Standards Mode; (3) 14% have already added an X-UA-Compatible meta tag or HTTP response header to render in IE7 Standards Mode...
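The switch in question is a one-line meta tag (shown below as a string for illustration); 'IE=EmulateIE7' honors the page's doctype, while 'IE=7' forces IE7 Standards Mode outright, and the same directive can be sent as an X-UA-Compatible HTTP response header instead:

    // Sketch: the X-UA-Compatible document-mode switch.
    const compatMetaTag = `<meta http-equiv="X-UA-Compatible" content="IE=EmulateIE7" />`;
    // Equivalent HTTP response header:  X-UA-Compatible: IE=EmulateIE7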
Compatibility and interoperability are complex. To reduce complexity for developers and users alike, we would love to see websites transition from legacy browser modes. We respect that the choice of mode is up to the site developer. We're excited to work with sites and standards bodies to continue improving IE's implementation of interoperable standards..."
http://www.eweek.com/c/a/Windows/Microsoft-Making-IE-8-Fully-Compatible-With-More-Web-Sites-364265/
See also Marc Silbey's blog article: http://blogs.msdn.com/ie/archive/2010/03/02/how-ie8-determines-document-mode.aspx
This brief article provides an example of working with REST (Representational State Transfer), a style of software architecture for accessing information on the Web. A RESTful service treats web services as resources that are accessed using XML over the HTTP protocol. The term REST dates back to 2000, when Roy Fielding used it in his doctoral dissertation. The W3C recommends using WSDL 2.0 as the language for defining REST web services. To explain REST, we take an example of purchasing items from a catalog application...
First we will define CRUD operations for this service as follows. The term CRUD stands for the basic database operations Create, Read, Update, and Delete. In the example, you can see that creating a new item with an Id is not supported. When a request for a new item is received, an Id is created and assigned to the new item. Also, we are not supporting the update and delete operations for the collection of items. Update and delete are supported for the individual items...
Interface documents: How does the client know what to expect in return when it makes a call for CRUD operations? The answer is the interface document. In this document you can define the CRUD operation mapping, the Item.xsd file, and the request and response XML. You can have separate XSDs for request and response, or the response can contain text such as 'success' for methods other than GET...
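A hedged sketch of that CRUD mapping, with placeholder paths and payloads: POST creates an item (the service assigns the Id), GET reads, and PUT/DELETE apply only to individual items:

    // Sketch of the catalog CRUD mapping; the /items paths and XML payload shape are placeholders.
    const base = "http://catalog.example.com/items";

    const createItem = (xml: string) =>
      fetch(base, { method: "POST", headers: { "Content-Type": "application/xml" }, body: xml });

    const readAll   = () => fetch(base);                                    // Read (collection)
    const readOne   = (id: string) => fetch(`${base}/${id}`);               // Read (single item)
    const updateOne = (id: string, xml: string) =>
      fetch(`${base}/${id}`, { method: "PUT", body: xml });                 // Update (item only)
    const deleteOne = (id: string) =>
      fetch(`${base}/${id}`, { method: "DELETE" });                         // Delete (item only)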
There are other frameworks available for RESTful Services. Some of them are listed here: the Sun reference implementation for JAX-RS, code-named Jersey, which uses an HTTP web server called Grizzly as well as the Grizzly Servlet; Ruby on Rails; Restlet; Django; Axis2.
http://www.devx.com/architect/Article/44341
W3C has created a new public group 'Web Fonts Working Group' as part of the W3C Fonts Activity. The initial Chair is Vladimir Levantovsky of Monotype Imaging. The mission of the Web Fonts Working Group is to develop specifications that allow the interoperable deployment of downloadable fonts on the Web. Existing specifications (CSS3 Fonts, SVG) explain how to describe and link to fonts, so the main focus will be the standardisation of font formats suited to the task, and a specification defining conformance (for fonts, authoring tools, viewers ...) covering all the technology required for WebFonts.
As the relevant specifications are all implemented, and either standardised (OpenType by ISO/IEC 14496-22:2009, SVG by the SVG 1.1
Recommendation) or mature (WOFF, EOT, CSS3 Fonts) the group would be chartered to only make the minimal changes needed for interoperability and standardisation. In addition, the provision of interoperable font formats would allow the testing of CSS3 Fonts, speeding it to Recommendation status.
One Recommendation-track Deliverable (candidate) is the "WOFF File Format" specification. This document specifies a simple compressed file format for fonts, designed primarily for use on the web. The WOFF format is directly based on the table-based sfnt structure used in TrueType, OpenType, and Open Font Format fonts, which are collectively referred to as sfnt-based fonts. A WOFF font file is simply a repackaged version of a sfnt-based font in compressed form. The format also allows font metadata and private-use data to be included separately from the font data. WOFF encoding tools convert an existing sfnt-based font into a WOFF formatted file, and user agents restore the original sfnt-based font data for use with a webpage.
A WebFont conformance specification is also expected. It will reference the font formats in existing use (OpenType, WOFF, SVG, and EOT), the font referencing and linking specifications (in both CSS and XML serialisations), access policies such as same-origin and CORS, and define which linking mechanisms, policies and formats are required for compliance. WOFF will be the required format for compliance, the others being optional. The Working Group will decide whether to make the formats and linking mechanisms normative references or, on the other hand, produce a document citable by other specifications (CSS3 Fonts, XSL, SVG) when claiming conformance..."
http://www.w3.org/2009/08/WebFonts/charter.html
See also Fonts on the Web: http://www.w3.org/Fonts/
"The W3C 'Widget Packaging and Configuration' specification is an emerging specification for configuring, packaging, and deploying widgets.
W3C widgets are components that are made up of HTML, cascading style sheets (CSS), JavaScript files, and other resources such as images. You can use widgets in devices for small applications such as calendars, weather reports, chats, and more.
One advantage of using widgets rather than normal Web applications is that they can be downloaded once and used many times after that, just like non-Web applications that are installed on a device. This allows users to save on bandwidth because the only data they transfer is the data used by the widget and not the widget files themselves. Widgets often provide a rich user experience, such as interactive calendars and even games. You can use widgets in mobile devices, where the advantage of downloading the widget once and using it over and over can save on data transfer costs.
As of January 2010, the W3C 'Widget Packaging and Configuration'
specification is in candidate recommendation state. This means that the W3C believes the specification is in a stable state and encourages developers to create implementations of the specification. The goal of the W3C widget specification is to propose a standard method for building and packaging widgets. There are currently many different vendors that have widgets, and almost all of them implement their own proprietary application program interface (API) and packaging format.
This article introduces the W3C Packaging and Configuration specification, showing you how you can package HTML, CSS, and JavaScript files into a widget that can be deployed to a device that implements the W3C widget specification. Because this is an emerging specification, the implementation choices for devices that render the widgets are limited.
If you want to see the widgets in action, you need to download some extra applications if you don't already have them installed..."
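Since a W3C widget package is essentially a ZIP archive containing a config.xml and the widget's start file, a minimal package can be sketched in a few lines of Python; the id, name, and start-file values below are invented for illustration, and the config.xml is kept to a bare minimum rather than showing the full range of configuration elements:

    # Sketch: package a minimal W3C widget (.wgt is a ZIP archive with config.xml at its root).
    import zipfile

    CONFIG_XML = """<?xml version="1.0" encoding="UTF-8"?>
    <widget xmlns="http://www.w3.org/ns/widgets" id="http://example.org/hello" version="1.0">
      <name>Hello Widget</name>
      <content src="index.html"/>
    </widget>
    """

    INDEX_HTML = "<!DOCTYPE html><html><body><h1>Hello from a widget</h1></body></html>"

    with zipfile.ZipFile("hello.wgt", "w", zipfile.ZIP_DEFLATED) as wgt:
        wgt.writestr("config.xml", CONFIG_XML)
        wgt.writestr("index.html", INDEX_HTML)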
http://www.ibm.com/developerworks/web/library/wa-w3cwidget/
See also Widget Packaging and Configuration: http://www.w3.org/TR/widgets/
In a large organization with complex analysis, modeling, and development initiatives spread across multiple projects, standardizing business semantics is key. Without a way to standardize the meanings and definitions of business concepts, each analysis, modeling, or development thread will naturally establish its own semantics. These disparate semantics can compound the already fragmented understanding of the relationship between IT assets and the business concepts they support.
For example, the business side of the house might clearly define the term Customer Tax Status. This enables each IT initiative that supports Customer Tax Status to use the defined meaning, which drives consistency of term name, definition, and related semantics across all the IT initiatives. By contrast, in the absence of such a structure, each IT initiative might naturally come to its own conclusion as to what Customer Tax Status means and how it should be defined. This can result in multiple structures, such as Customer Tax Code, Tax Status, Customer Code, all of which loosely imply the same semantics but differ in name and definition...
InfoSphere Business Glossary provides a means to specify business concepts and to manage the relationship among those concepts and the IT structures that support them. However, this content is only useful if it is easy to access. For example, without immediate and efficient access to glossary content, model users, including service analysts, component designers, and logical data modelers, might ignore the glossary and define their own terms. The glossary content should be available within the modeling tools, making the content impossible for the modeler to ignore. Still, there might be complications with model interchange and synchronization as relationships between model structures and glossary terms must be retained as models flow from tool to tool...
These new functions within the modeling platform fundamentally change the capability of an enterprise to define and control business semantics across various modeling domains. These techniques, properly applied, can greatly reduce the variation in business definitions across modeling efforts, across projects, and across line-of-business boundaries..."
http://www.ibm.com/developerworks/data/library/techarticle/dm-1003infospheremodelingtools/index.html
"As the Internet continues to evolve, Semantic Web technologies are beginning to emerge, but widespread adoption is likely to still be two to three years out... The web as originally conceived was largely static -- web content, once posted, usually didn't change significantly.
However, by 2010, the vast majority of content that is developed on the web falls more properly into the realm of messages rather than documents -- Facebook and Twitter notifications, resources generated from rapidly changing databases, documents in which narrative content is embedded within larger data structures, RSS and Atom feeds, KML (ironically, Google Earth and Google Maps) documents and so forth.
Thus, a URL no longer contains a static narrative -- it contains a constantly changing message.
Document enrichment by itself is of only moderate utility -- you are simply adding attributes to HTML elements to identify the category of a given word. With CSS, for instance, you could highlight the matched terms by category, visually showing place names compared to personal names. However, such enrichment gains more power when these XML documents are processed afterwards -- you can pull categories out and add them to a general list of categories for the resource in question, or you could create links to specific content such as Wikipedia or the online Physicians' Desk Reference... There are currently three types of formats used for document enrichment. The first is essentially a proprietary or ad hoc standard -- the DE vendor provides a formal taxonomy system and method for embedding the formats within the initial text sample. The next approach (and one that is actually diminishing in use) is that of microformats: using an agreed-upon standard taxonomy for certain domains, such as document publishing (Dublin Core), friendship relationships (Friend of a Friend, or FOAF), address books (vCard), geocoding information (geo) and so forth. The problem with microformats is that they don't always work well in conjunction, and there's no way of encoding deeper relational information via most microformats.
This latter issue lies at the heart of the Resource Description Framework in Attributes, or RDFa, which makes it possible to encode relational information about different resources and resource links. RDFa is actually a variant of the much older W3C RDF language first formulated in 1999, then significantly updated in 2004. With RDFa, you can define multiple distinct categories (also known as namespaces) with terms in each category... You can also establish relationships between different parts of a document by using RDF terminology -- for instance, indicating that a given introductory paragraph provides a good abstract summary "about" the document (or portion of a document) in question. There's even a specialized language called GRDDL that can take an RDFa-encoded document and generate a corresponding RDF document. While comparatively few document enrichment companies have RDFa products on the market, many are moving in that direction, with organizations such as the BBC, NBC News, Time Inc. and Huffington Post among many others now exploring RDFa as a means of encoding such categorization information in the stories that are posted online...
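To make the "paragraph that is about the document" example concrete, the Python sketch below hand-builds (with rdflib) the sort of triples an RDFa processor would extract from such markup; the URIs, the Dublin Core property choices, and the literal values are illustrative, not taken from any of the publishers mentioned:

    # Sketch: triples an RDFa processor might extract from markup such as
    #   <p about="http://example.org/article" property="dcterms:abstract">...</p>
    # Built by hand here with rdflib; URIs and property choices are illustrative.
    from rdflib import Graph, Literal, URIRef
    from rdflib.namespace import DCTERMS

    g = Graph()
    article = URIRef("http://example.org/article")
    g.add((article, DCTERMS.abstract,
           Literal("An introductory paragraph that summarizes the article.")))
    g.add((article, DCTERMS.creator, Literal("A. Reporter")))
    print(g.serialize(format="turtle"))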
These are progressive technologies -- XML technologies are now about a decade old, and XQuery and XML database tools are just now really becoming mainstream. Semantic Web technologies are beginning to emerge, but widespread adoption is likely to still be two to three years out. However, publishing and journalism are definitely at the forefront of that curve, because these areas in particular are most sensitive both to the need to provide enjoyable news content and to the need to make such stories manipulable and discoverable within the ever-increasing sophistication and scope of the web itself. The narrative thread has become a rich, interwoven tapestry, illuminated by brilliant strands of meaning, semantics and abstraction, turning our writings into conversations, and from there into dialogs..."
http://www.devx.com/semantic/Article/44232
See also the W3C Semantic Web: http://www.w3.org/standards/semanticweb/
"Support for the next generation of HTML is already appearing in today's browsers and Web pages... Anticipation is mounting for HTML5, the overhaul of the Web markup language currently under way at the Worldwide Web Consortium (W3C). For many, the revamping is long overdue. HTML hasn't had a proper upgrade in more than a decade... Many claim the HTML and XHTML standards have become outdated, and that their document-centric focus does not adequately address the needs of modern Web applications.
HTML5 aims to change all that. When it is finalized, the new standard will include tags and APIs for improved interactivity, multimedia, and localization. As experimental support for HTML5 features has crept into the current crop of Web browsers, some developers have even begun voicing hope that this new, modernized HTML will free them from reliance on proprietary plug-ins such as Flash, QuickTime, and Silverlight.
But although some prominent Web publishers -- including Apple, Google, the Mozilla Foundation, Vimeo, and YouTube -- have already begun tinkering with the new standard, W3C insiders say the road ahead for
HTML5 remains a rocky one. Some parts of the specification are controversial, while others have yet to be finalized. It may be years before a completed standard emerges and even longer before the bulk of the Web-surfing public moves to HTML5-compatible browsers. In the meantime, developers face a difficult challenge: how to build rich Web applications with today's technologies while paving the way for a smooth transition to HTML5 tomorrow.
Standards bodies by their very nature move slowly, but work on HTML5 is being driven by large, motivated vendors, including Adobe, Apple, Google, Microsoft, the Mozilla Foundation, Opera Software, and others. These companies recognize the need for an upgrade to the HTML standard, and their work is helping to realize its potential. The resulting opportunities for Web developers are too compelling to ignore..." [Note:
the Editor's draft version of HTML5 supports three different views (Normal view, Hide UA text, Highlight UA text) via radio-button selection.]
http://www.infoworld.com/print/115611
See also the HTML5 draft specification: http://dev.w3.org/html5/spec/Overview.html
At the International Semantic Web Conference, being held this week in Chantilly, Va., Dean Allemang, chief scientist at Semantic Web consulting firm TopQuadrant, offered a solid example of how a machine-readable Web would help us all, in theory anyway... The W3C has recently been promoting the idea of making the Web machine-readable, or a Web of data. What does that mean?
Allemang's example was work-related: booking hotels... Relational databases make the prospect feasible. With databases, you can structure data so each data element is slotted into a predictable location. You can query a database of personnel data to return the birth date of a particular person, because the row of data with that person's info has a column dedicated to the birth date. This approach wouldn't work so well for data beyond a single database...
The answer the W3C has come up with comes in the form of a set of interrelated standards that can be used to embed data on Web sites, as well as to interpret the data that is found there. One standard is the Resource Description Framework. The other is the Web Ontology Language, or OWL... A query against a triple store, which is what an RDF database is called, can link together disparate facts. If another triple, perhaps located in another triple store, contains the fact that Yellowstone contains the Mammoth Hot Springs, a single search across multiple triple stores can return both facts... In essence, with RDF, a user can build a set of data from various sources on the Web that may not have been brought together before... How do you use these triples? One way is through the query language for RDF, called SPARQL (an abbreviation for the humorously recursive SPARQL Protocol and RDF Query Language). With Structured Query Language (SQL), you can query multiple database tables through the JOIN function. With a SPARQL query, you specify all the triples you would need, and the query engine will filter down to the answers that fit all of your criteria..."
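A small Python/rdflib sketch of the Yellowstone example: two facts that could have come from different sources are loaded into one graph and joined by a single SPARQL query. The ex: namespace and the property names are invented for illustration:

    from rdflib import Graph, Namespace

    EX = Namespace("http://example.org/geo#")
    g = Graph()
    g.add((EX.Yellowstone, EX.locatedIn, EX.Wyoming))          # fact from one source
    g.add((EX.Yellowstone, EX.contains, EX.MammothHotSprings)) # fact from another source

    results = g.query("""
        PREFIX ex: <http://example.org/geo#>
        SELECT ?feature ?state WHERE {
            ex:Yellowstone ex:contains ?feature .
            ex:Yellowstone ex:locatedIn ?state .
        }""")
    for feature, state in results:
        print(feature, state)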
http://gcn.com/blogs/tech-blog/2009/10/machine-readable-web.aspx
See also the ISWC 2009 Conference: http://iswc2009.semanticweb.org/
"The Internet Corporation for Assigned Names and Numbers appears poised to move forward on allowing Internationalized Domain Names, with a vote on the matter set for Friday [2009-10-30] at the organization's meeting in Seoul. For the last five years, ICANN has come under pressure to move away from the use of Web addresses written only in the Roman alphabet, so that users around the world can write Web addresses in their own languages and scripts. Some countries are impatient to adopt their own domain name systems for doing this, but such moves could fragment the Internet, making parts of it invisible to countries not using the same DNS. IDNs have been undergoing tests for the last three years and starting November 16, 2009, countries can apply to test country-code Top-Level Domains...
While [currently] some parts of a URL can be written in non-Latin languages, the country-code portion, such as .ru (Russia) or .jp (Japan), for instance, must use the Roman alphabet. Chinese, Arabic, Korean, Japanese, Greek, Hindi, Hebrew and Russian have been among the languages that cannot be used in a ccTLD or full e-mail address. For instance, a business card might be written in Korean, but the Internet domain and e-mail address were in English... Out of the 1.6 billion Internet users worldwide, 56% use languages that have scripts based on alphabets other than Latin, which was a catalyst in the IDN process..."
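Under the hood, internationalized labels are carried in the DNS as ASCII-compatible "punycode" strings. The short Python sketch below shows the conversion using the built-in idna codec (which implements the older IDNA 2003 rules, so it is only an approximation of current registry practice):

    label = "пример"                      # a Cyrillic domain label
    ascii_form = label.encode("idna")     # -> an ASCII-compatible b'xn--...' form
    print(ascii_form)
    print(ascii_form.decode("idna"))      # round-trips back to the original label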
http://www.computerworld.com/s/article/9139989/
See also the Wall Street Journal: http://online.wsj.com/article/SB125664117322309953.html
Microsoft announced that it will "provide patent- and license-free use rights to the format behind its Outlook Personal Folders opening e-mail, calendar, contacts and other information to a host of applications such as antimalware or cloud-based services. Documenting and publishing the .pst format could open up entirely new feature sets for programs such as search tools for mining mailboxes for relevant corporate data, new security tools that scan .pst data for malicious software, or e-discovery tools for meeting compliance regulations...
The written documentation would explain how to parse the contents of the '.pst' file, which houses the e-mail, calendar and contact contents of Outlook Personal Folders. The documentation will detail how the data is stored, along with guidance for accessing that data from other software applications. The effort is designed to give programs the knowledge to read Outlook data stored on user desktops.
Full details are presented in the blog article of Paul Lorimer (Group Manager, Microsoft Office Interoperability) "Roadmap for Outlook Personal Folders (.pst) Documentation." Excerpt: "On desktops, this data is stored in Outlook Personal Folders... Developers can already access the data stored in the .pst file, using Messaging API (MAPI) and the Outlook Object Model -- a rich set of connections to all of the data stored by Outlook and Exchange Server -- but only if Outlook is installed on the desktop. In order to facilitate interoperability and enable customers and vendors to access the data in .pst files on a variety of platforms, we will be releasing documentation for the '.pst'
file format. This will allow developers to read, create, and interoperate with the data in .pst files in server and client scenarios using the programming language and platform of their choice... It will be released under our Open Specification Promise, which will allow anyone to implement the .pst file format on any platform and in any tool, without concerns about patents, and without the need to contact Microsoft in any way. Designing our high-volume products to enable such data portability is a key commitment under our Interoperability Principles, which we announced in early 2008. We support this commitment through our product features, documented formats, and implementation of standards..."
http://www.computerworld.com/s/article/9139968/
See also Lorimer's blog article: http://blogs.msdn.com/interoperability/archive/2009/10/26/roadmap-for-outlook-personal-folders-pst-documentation.aspx
This article reports on a Google proposal by Katharina Probst, Bruce Johnson, Arup Mukherjee, Erik van der Poel, and Li Xiao. "Google is proposing a new standard for making AJAX-based Web sites search-engine friendly. If adopted, the standard could mean developers no longer have to choose between site optimization and dynamic pages. It's been common knowledge for years that Web sites created using Asynchronous JavaScript and XML would not be crawled and indexed by search engines, often forcing Web developers to choose between search engine visibility and the dynamic features offered by AJAX. While AJAX-based Web sites are popular with users, search engines traditionally are not able to access any of the content on them...
Google also rolled out filtered search options for smartphones powered by Android and Palm's webOS operating systems, as well as for iPhones. Now mobile users can narrow searches using several criteria, including date ranges, forum posts and so on... Google now offers nine more Search Options filters -- including date ranges and options for more or fewer e-commerce sites. Search Options are found in the "show options" link, in the lightly shaded blue bar above the search results. As a result, Google now enables users to choose among the following: past hour, specific date range, more shopping sites, fewer shopping sites, visited pages, not yet visited, books, blogs and news..."
From the Google "Proposal for Making AJAX Crawlable," as reported in the blog post by John Mueller: "[Google proposes] a new standard for making AJAX-based websites crawlable. This will benefit webmasters and users by making content from rich and interactive AJAX-based websites universally accessible through search results on any search engine that chooses to take part. We believe that making this content available for crawling and indexing could significantly improve the web...
Some of the goals that we wanted to achieve with this proposal were:
(1) Minimal changes are required as the website grows; (2) Users and search engines see the same content -- no cloaking; (3) Search engines can send users directly to the AJAX URL -- not to a static copy;
(4) Site owners have a way of verifying that their AJAX website is rendered correctly and thus that the crawler has access to all the content... We are currently working on a proposal and a prototype implementation; feedback is very welcome..."
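Purely as an illustration of the general idea (and not the mechanics of Google's proposal, which were still being worked out at the time), a site could expose a pre-rendered HTML snapshot of an AJAX state so that crawlers and users see the same content; the _snapshot query parameter and routes below are invented for this sketch:

    from flask import Flask, request

    app = Flask(__name__)

    AJAX_SHELL = "<html><body><div id='app'></div><script src='app.js'></script></body></html>"

    def render_state(state):
        # In practice this would run the same JavaScript the browser runs
        # (e.g. in a headless browser) and capture the resulting DOM,
        # so the snapshot matches what users see -- no cloaking.
        return f"<html><body><h1>Content for state {state}</h1></body></html>"

    @app.route("/app")
    def app_page():
        state = request.args.get("_snapshot")
        if state is not None:                 # a crawler asked for a static snapshot
            return render_state(state)
        return AJAX_SHELL                     # normal users get the JavaScript application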
http://www.internetnews.com/dev-news/article.php/3842916/Google+Says+Have+Your+AJAX+and+SEO+Too.htm
See also the Google blog: http://googlewebmastercentral.blogspot.com/2009/10/proposal-for-making-ajax-crawlable.html
"U.S. White House officials announced that the Federal Register is now available in a format that lets readers browse, reorganize, and electronically customize the publication's daily contents. Issues of the Federal Register in XML format are now available at federalregister.gov. The XML documents are aslo available at Data.gov and GPO.gov. XML is a machine readable form of text that can be manipulated to work with digital applications, allowing people to analyze its contents in various ways... In 2008, editions of the daily publication contained nearly 32,000 separate documents on nearly 80,000 pages; the register chronicles White House and agencies'
activities and proposed changes to federal regulations..."
From the announcement: "XML is a form of text that can be manipulated in virtually limitless ways with digital applications. For example, people who want to know about the workings of the Executive branch of the Federal Government no longer need to sift through the Federal Register in its traditional Department-by-Department and Agency-by-Agency format. In this new format users can rearrange the Federal Register's contents in personalized ways to match their particular interests. It is now possible, for example, to download the Federal Register and easily see what proposed actions might affect one's community or region, or what actions might have an impact on one's profession or business interests...
The transformation, undertaken by the Government Printing Office and the National Archives and Records Administration, vastly increases the Federal Register's usefulness to the American public and further opens the curtains on the inner workings of Government, a major goal of the Obama Administration... This paves the way for consumers, rather than Government officials to be in charge of deciding how to access critical information. The Government Printing Office and the Office of the Federal Register have accomplished a minor miracle in warp-speed time..."
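As a hypothetical example of the kind of processing the XML editions enable, the Python sketch below filters one day's issue by agency with ElementTree; the file name and the DOCUMENT/AGENCY/SUBJECT element names are assumptions made for illustration and would need to be checked against the actual schema published at federalregister.gov:

    import xml.etree.ElementTree as ET

    tree = ET.parse("federal-register-issue.xml")     # hypothetical local copy of one issue
    for doc in tree.iter("DOCUMENT"):                 # element names are assumptions
        agency = doc.findtext("AGENCY", default="")
        if "Environmental Protection Agency" in agency:
            print(agency, "-", doc.findtext("SUBJECT", default="(no subject)"))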
http://gcn.com/articles/2009/10/07/federal-register-data.aspx
See also the announcement: http://federalregister.gov/documents/XML_Federal_Register.pdf
Members of the W3C HTML Working Group have published a First Public Working Draft for "HTML+RDFa: A Mechanism for Embedding RDF in HTML."
"RDFa is intended to solve the problem of machine-readable data in HTML documents. RDFa provides a set of HTML attributes to augment visual data with machine-readable hints. Using RDFa, authors may turn their existing human-visible text and links into machine-readable data without repeating content.
Today's web is built predominantly for human consumption. Even as machine-readable data begins to permeate the web, it is typically distributed in a separate file, with a separate format, and very limited correspondence between the human and machine versions. As a result, web browsers can provide only minimal assistance to humans in parsing and processing web data: browsers only see presentation information.
This specification defines rules and guidelines for adapting the 'RDF in XHTML: Syntax and Processing (RDFa)' specification for use in the HTML5 and XHTML5 members of the HTML family. The rules defined in this document not only apply to HTML5 documents in non-XML and XML mode, but also to HTML4 documents interpreted through the
HTML5 parsing rules..."
http://www.w3.org/TR/2009/WD-rdfa-in-html-20091015/
See also the W3C HTML Working Group: http://www.w3.org/html/wg/
"E-mail used to be the Internet's killer app. Mozilla's Raindrop software anticipates a world where email has been reduced to one channel among many... Just as Google Wave represents an attempt to imagine what email would look like if it were invented today, Mozilla's Raindrop represents an attempt to imagine a more modern communication client. Developed by the team that created Mozilla's Thunderbird e-mail client, Raindrop recognizes that the diverse range of communication channels -- Twitter, IM, Skype, Facebook, Google Docs, E-mail -- would be more useful if presented in a unified interface... As with other Mozilla open-source projects, the Raindrop team is encouraging interested developers to participate and contribute code to improve the project..."
From the Mozilla Labs web site: "Raindrop is a new exploration by the team responsible for Thunderbird to explore new ways to use open Web technologies to create useful, compelling messaging experiences.
Raindrop's mission: make it enjoyable to participate in conversations from people you care about, whether the conversations are in email, on twitter, a friend's blog or as part of a social networking site.
Raindrop uses a mini web server to fetch your conversations from different sources (mail, twitter, RSS feeds), intelligently pulls out the important parts, and allows you to interact with them using your favorite modern web browser (Firefox, Safari or Chrome). Raindrop comes with a built-in experience that bubbles up what conversations are important to you. You can participate in the experience by writing extensions that use standard open Web technologies like HTML, JavaScript and CSS. Or, use the lower level APIs to make your own experience. You have control over your conversations and how you want to play with them..."
http://www.informationweek.com/news/internet/web2.0/showArticle.jhtml?articleID=220900386
See also the Mozilla Labs Raindrop web site: http://labs.mozilla.com/raindrop
When Tim Berners-Lee, inventor of the World Wide Web, entered the room for the final interview at the Web 2.0 Summit, the audience stood up for him. Appropriately so, since most of those present owe their livelihoods to his invention. In an on-stage interview with Tim O'Reilly, the audience was listening to Berners-Lee not just for his perspective but his guidance... Here's what I heard:
TBL: (1) Don't build your laws into the Web. "Technology shouldn't tell you what's right and what's wrong... (2) Fault-tolerance is vital.
(3) If you want it everywhere, give it away. (4) Large companies/govt are the enemy: "I'm worried about anything large coming in to take control, whether it's large companies or government.." (5) Small open companies can topple big closed ones. (6) Separate design from device.
(7) Consider content as app. (8) Forge trust. (9) Make the Web work for more people...
http://news.cnet.com/8301-19882_3-10381726-250.html
See also Web 2.0 Summit 2009: http://www.web2summit.com/web2009
"Some seeds for overhauling Web browser graphics were planted more than a decade ago, and Google believes now is the time for them to bear fruit.
The company is hosting the SVG Open 2009 conference to dig into a standard called Scalable Vector Graphics (SVG) that can bring the technology to the Web. With growing support from browser makers, an appetite for vector graphics among Web programmers, and new work under way to make SVG a routine part of the Web, the technology has its best chance in years at becoming mainstream. New Web programming standards are hard to nurture, but they do arrive, said Brad Neuberg, a Google programmer and speaker at the conference...
Vector graphics describe imagery mathematically with lines, curves, shapes, and color values rather than the grid of colored pixels used by bitmapped file formats such as JPEG or GIF widely used on the Web today. Where appropriate, such as with corporate logos but not photographs, vector graphics bring smaller file sizes and better resizing flexibility. That's good for faster downloads and use on varying screen sizes. For one example, try the SVG version of the Wikipedia logo using the page-zoom tools in Firefox, Safari, Chrome, or Opera. It's a big SVG file, but it does scale. Another real-world
example: the illustrations in Google Docs use SVG..."
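The point about shapes being described mathematically can be seen in a few lines: the sketch below writes a tiny, invented SVG logo from Python, and the same short file renders crisply at any zoom level:

    SVG = """<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200" viewBox="0 0 200 200">
      <rect x="10" y="10" width="180" height="180" fill="#4a90d9"/>
      <circle cx="100" cy="100" r="60" fill="white"/>
      <text x="100" y="108" text-anchor="middle" font-size="24">logo</text>
    </svg>
    """
    with open("logo.svg", "w", encoding="utf-8") as f:
        f.write(SVG)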
http://news.cnet.com/8301-30685_3-10365636-264.html
See also the W3C Scalable Vector Graphics (SVG) web site: http://www.w3.org/Graphics/SVG/
Earlier the author wrote: "perhaps the looming challenge for document standards is not in deciding or developing perfect formats, but in integrating the packaged world of documents with the fragmented world of web resources. Documents that can be websites."
More: "he most likely future for documents and their formats, is that each document will start to look/act/be implemented more and more like a tiny, self-contained website. If you look at the big trends in documents, it seems a plausible direction: (1) XML-in-ZIP packaging chunks the document into smaller resources which are then locatable by URL; (2) The old SGML desire for a separation of concerns between data and processing is now the dominant paradigm for documents encourages this fragmentation; (3) The old SGML one-bit-fat-tree approach has failed to withstand the need for more layering and fragmentation, and is being replaced (notably in DITA but also I see RDF fitting in here
too) with a system of smaller chunks of data linked together by hierarchical/tabular maps: any container that contains a container is liable to be moved to a map chunk of some kind. (4) VBA's security issues are leading to its demise, certainly on the Mac platform, and its downplaying on Windows by MS... (5) The possibility I raised two years ago seems to be coming closer to the mainstream: recently I read of a distro of Open Office which presents rendered PDF pages by default, and then switches to ODF if you want to edit the page: the ODF document is both PDF and ODF at the same time...
There is a flip-side: as documents become more like tiny websites, the desktop application will become more and more like a browser with an internal web server. If you like MVC, maybe you could say that the document is the model, the browser provides the view, and the server and scripts provide the controllers..."
http://broadcast.oreilly.com/2009/07/documents-as-miniature-website.html
Jon Bosak, Co-Chair of the OASIS Universal Business Language Technical Committee, announced that "UBL 2.0 International Data Dictionary, Volume 1: Japanese, Italian, and Spanish" has been approved as an OASIS Committee Specification. Volume 1 provides informative Japanese, Italian, and Spanish translations of the roughly 2000 business terms normatively defined in English in the UBL 2.0 distribution, as updated by the Errata package released in May 2008.
From the Introduction: "UBL, the Universal Business Language, defines standard XML representations of common business documents such as purchase orders, invoices, and shipping notices. UBL 1.0, released as an OASIS Standard in November 2004, normatively defines over 600 standard business terms (represented as XML element names) that serve as the basis for eight basic standard XML business document types. These English-language names and their corresponding definitions constitute the UBL 1.0 data dictionary -- not a separate publication, but simply a label for the collection of all the element names and definitions contained in the UBL 1.0 data model spreadsheets and in the XML schemas generated from these data models. As an informational aid for UBL users, UBL localization subcommittees subsequently translated all of the UBL 1.0 definitions into Chinese (traditional and simplified), Japanese, Korean, Spanish, and Italian. These translations were published in a single merged spreadsheet called the UBL 1.0 International Data Dictionary (IDD)...
As the translation effort is expected to take some time, the UBL Technical Committee is releasing the UBL 2.0 IDD in stages as the localization subcommittees complete their work in order to begin the public review that is an integral part of the OASIS specification process. This first release, Volume 1, contains translations of all the UBL 2.0 data definitions into Japanese, Italian, and Spanish; subsequent releases will add further translations as they become available..."
http://docs.oasis-open.org/ubl/idd/cs-UBL-2.0-idd01/cs-UBL-2.0-idd01.html
See also the announcement: http://lists.oasis-open.org/archives/ubl-dev/200907/msg00132.html
A member submission made to W3C early in 2009 has been acknowledged by W3C with a team comment that the specification work "is not only innovative, but very well documented and demonstrated by the on-line XSPARQL demo. XSPARQL is a fusion language, blending aspects of XQuery and SPARQL to enable users familiar with both XQuery and SPARQL to write queries which bridge the two systems." This submission was made jointly by several institutions, complete with ten signatories to Intellectual Property Statements supporting W3C's royalty-free IPR policy.
From the Abstract: "Recently, two new languages have entered the stage for processing XML and RDF data: XQuery is a W3C Recommendation since early last year and SPARQL has finally received W3C's Recommendation stamp in January 2008. While both languages operate in their own worlds (SPARQL in the RDF- and XQuery in the XML-world), we show in this specification that the merge of both in the novel language XSPARQL has the potential to finally bring XML and RDF closer together. XSPARQL provides concise and intuitive solutions for mapping between XML and RDF in either direction, addressing both the use cases of GRDDL and SAWSDL. As a side effect, XSPARQL may also be used for RDF to RDF transformations beyond the capabilities of "pure" SPARQL. We also describe an implementation of XSPARQL, available for user evaluation..."
"XSPARQL essentially combines the FROM and WHERE clauses from SPARQL with the XQuery FLOWR grammar, allowing direct manipulation of the XML Results of SPARQL queries. This allows XSPARQL to manipulate SPARQL results into e.g. XHTML, and integrate queries over RDF and XML sources.
XSPARQL can generate RDF, either as RDF/XML or as Turtle, enabling
(1) transformation from XML to RDF, and (2) additional functionality, e.g., aggregates, for SPARQL queries..."
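XSPARQL is its own language, but the spirit of "SPARQL results manipulated into XHTML" can be approximated in plain Python with rdflib and ElementTree; the data, query, and markup below are invented for illustration and are not XSPARQL syntax:

    import xml.etree.ElementTree as ET
    from rdflib import Graph, Literal, Namespace

    FOAF = Namespace("http://xmlns.com/foaf/0.1/")
    EX = Namespace("http://example.org/people/")

    g = Graph()
    g.add((EX.alice, FOAF.name, Literal("Alice")))
    g.add((EX.bob, FOAF.name, Literal("Bob")))

    rows = g.query("""
        PREFIX foaf: <http://xmlns.com/foaf/0.1/>
        SELECT ?name WHERE { ?person foaf:name ?name } ORDER BY ?name""")

    ul = ET.Element("ul")                 # wrap the bindings in an XHTML list
    for (name,) in rows:
        li = ET.SubElement(ul, "li")
        li.text = str(name)
    print(ET.tostring(ul, encoding="unicode"))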
http://www.w3.org/Submission/2009/01/Comment
See also the XSPARQL Submitted Materials: http://www.w3.org/Submission/2009/01/
"The Web's trajectory toward interactivity, which began with humble snippets of JavaScript used to validate HTML forms, has really started to accelerate of late. A new breed of Web applications is starting to emerge that sports increasingly interactive user interfaces based on direct manipulations of the browser DOM (document object model) via ever-increasing amounts of JavaScript. Google Wave, publicly demonstrated for the first time in May 2009 at the Google I/O Developer Conference in San Francisco, exemplifies this new style of Web application. Instead of being implemented as a sequence of individual HTML 'pages' rendered by the server, Wave might be described as a client/server application in which the client is a browser executing a JavaScript application, while the server is 'the cloud'...
We needed to supply many different implementations of UI functionality
-- version A for Firefox, version B for Safari, and so forth -- without burdening the compiled application with the union of all the variations, thereby forcing each browser to download at least some amount of irrelevant code. Our solution is a unique mechanism we dubbed deferred binding, which arranges for the GWT compiler to produce not one output script, but an arbitrary number of them, each optimized for a particular set of circumstances... our experience in developing GWT has thoroughly convinced us that there's no need to give in to the typical constraints of Web development..."
http://queue.acm.org/detail.cfm?id=1572457
About "Broad, useful, vocabularies with plenty of sample data...
digging into the vocabularies used in DBpedia's massive collection of RDF triples... how to chain statements together with shared resource references... note two DBpedia vocabularies being used not to query DBpedia, but to model data completely outside of the context of DBpedia, because they offered straightforward, dereferenceable URIs for these things...
The Linked Movie Database project team has worked out a specific property vocabulary as part of their project, while the DBpedia one has grown more organically, leading to many more strange edge cases among the well-chosen terms... While the Library of Congress Subject Headings provide a solid, professional taxonomy and a set of URIs for a wide variety of subjects and concepts, they don't have them for places or people. (They might have one for London (England) -- History, but they don't have one for 'London (England)'.) So, while they have a URI for the concept of sightings of Elvis Presley since his death, they have no URI for Elvis himself. Nor do they have one for Einstein, and I don't know what well-known vocabulary does, so the RDFa spec's authors went with the DBpedia URI for the famous physicist.
To look for a property name you might need, you can check a DBpedia page for a resource that may have had that property assigned to it.
You can also download an N-Triples or CSV file in your choice of fourteen languages from DBpedia's Download Page... Plenty of other hard work continues to make the DBpedia predicate vocabulary more valuable to all of us, so it's worth keeping an eye on the work going on around this vocabulary..."
DBpedia: "Knowledge bases are playing an increasingly important role in enhancing the intelligence of Web and enterprise search and in supporting information integration. Today, most knowledge bases cover only specific domains, are created by relatively small groups of knowledge engineers, and are very cost intensive to keep up-to-date as domains change. At the same time, Wikipedia has grown into one of the central knowledge sources of mankind, maintained by thousands of contributors. The DBpedia project leverages this gigantic source of knowledge by extracting structured information from Wikipedia and by making this information accessible on the Web under GNU Free Documentation License. The DBpedia knowledge base currently describes more than 2.6 million things, including at least 213,000 persons, 328,000 places, 57,000 music albums, 36,000 films, 20,000 companies.
The knowledge base consists of 274 million pieces of information (RDF triples). It features labels and short abstracts for these things in 30 different languages; 609,000 links to images and 3,150,000 links to external web pages; 4,878,100 external links into other RDF datasets, 415,000 Wikipedia categories, and 75,000 YAGO categories..."
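As an illustration of querying this knowledge base, the Python sketch below asks the public DBpedia SPARQL endpoint for Einstein's birth date using SPARQLWrapper; the choice of the dbpedia.org/ontology/birthDate property is an assumption based on the DBpedia ontology's naming and may need adjusting:

    # pip install SPARQLWrapper
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper("http://dbpedia.org/sparql")
    sparql.setQuery("""
        SELECT ?birthDate WHERE {
            <http://dbpedia.org/resource/Albert_Einstein>
                <http://dbpedia.org/ontology/birthDate> ?birthDate .
        }""")
    sparql.setReturnFormat(JSON)
    for binding in sparql.query().convert()["results"]["bindings"]:
        print(binding["birthDate"]["value"])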
http://www.snee.com/bobdc.blog/2009/07/modeling-your-data-with-dbpedi.html
See also the DBpedia Knowledge Base: http://dbpedia.org/About
A W3C announcement "A Sprinkle of POWDER Fosters Trust on the Web" reports on the approval of three W3C Recommendations from the POWDER Working Group, which "takes steps toward building a Web of trust, and making it possible to discover relevant, quality content more efficiently." POWDER is a new W3C Standard intended to raise confidence in site quality, relevance and authenticity. The W3C POWDER (Protocol for Web Description Resources) Working Group was chartered to specify a protocol for publishing descriptions of (e.g. metadata about) Web resources using RDF, OWL, and HTTP.
"When content providers use POWDER, the Protocol for Web Description Resources, they help people with tasks such as seeking sound medical advice, looking for trustworthy retailers, or searching for content available under a particular license -- for instance, a Creative Commons license...
A site wishing to promote the mobile-friendliness of its content or applications can tell the world using POWDER. Content providers start by creating content that is conformant with W3C's mobileOK scheme and validating it with the mobileOK Checker. The checker generates POWDER statements that apply to individual pages. But a key feature of POWDER is that it lets content providers make statements about groups of resources -- typically all the pages, images and videos on a Web site.
Other tools such as the i-sieve POWDER generator (not from W3C) generate POWDER statements about the mobile-friendliness of entire sites. Once these POWDER statements are in place, they can be used by search engines or other tools to help people find mobile-friendly content..."
http://www.w3.org/2009/09/powder-pr.html
See also the POWDER Working Group home page: http://www.w3.org/2007/powder/
"Conventional wisdom is that Web wanderers are safe as long as they avoid sites that serve up pornography, stock tips, games and the like. But according to recently gathered research from Boston-based IT security and control firm Sophos, sites we take for granted are not as secure as they appear.
Among the findings in Sophos' threat report for the first six months of this year, 23,500 new infected Web pages -- one every 3.6 seconds -- were detected each day during that period. That's four times worse than the same period last year, said Richard Wang, who manages the Boston lab. Many such infections were found on legitimate websites. In a recent interview with CSOonline, Wang outlined seven primary reasons legitimate sites are becoming more dangerous..."
http://www.networkworld.com/news/2009/090909-7-reasons-websites-are-no.html
W3C announced the publication of a Working Group Note on "Authoring
HTML: Handling Right-to-left Scripts." The document was produced by members of the Internationalization Core Working Group, part of the W3C Internationalization Activity.
The document provides advice for the use of HTML markup and CSS style sheets to create pages for languages that use right-to-left scripts, such as Arabic, Hebrew, Persian, Thaana, Urdu, etc. It explains how to create content in right-to-left scripts that builds on but goes beyond the Unicode bidirectional algorithm, as well as how to prepare content for localization into right-to-left scripts.
The specification is intended for all content authors working with HTML and CSS who are working with text in a language that uses a right-to-left script, or whose content will be localized to a language that uses a right-to-left script. The term 'author' is used in the sense of a person that creates content either directly or via a script or program that generates HTML documents.
It provides guidance for developers of HTML that enables support for international deployment. Enabling international deployment is the responsibility of all content authors, not just localization groups or vendors, and is relevant from the very start of development.
Ignoring the advice in this document, or relegating it to a later phase in the development process, will only add unnecessary costs and resource issues at a later date. It is assumed that readers of this document are proficient in developing HTML and XHTML pages..."
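A minimal example in the spirit of the Note's advice: overall document direction is set with the dir attribute in the markup rather than with CSS alone, and an embedded left-to-right phrase gets its own dir attribute. The page below is an invented illustration, written out from Python only for convenience:

    PAGE = """<!DOCTYPE html>
    <html dir="rtl" lang="ar">
      <head><meta charset="utf-8"><title>مثال</title></head>
      <body>
        <p>هذه فقرة عربية تحتوي على عبارة لاتينية:
           <span dir="ltr">XHTML 1.0</span>.</p>
      </body>
    </html>
    """
    with open("rtl-example.html", "w", encoding="utf-8") as f:
        f.write(PAGE)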
http://www.w3.org/TR/i18n-html-tech-bidi/
A nascent technology called WebGL for bringing hardware-accelerated 3D graphics to the Web is getting a lot closer to reality. Last week, programmers began building WebGL into Firefox's nightly builds, the developer versions used to test the latest updates to the open-source browser. Also this month, programmers began building WebGL into WebKit, the project that's used in both Apple's Safari and Google's Chrome.
Wolfire Games picked up on the WebKit move and offered a video of WebGL in action. Overall, the moves stand to accelerate the pace of WebGL development by making it easier to try out.
WebGL is one of several efforts under way to make Web browsers into a more powerful computing platform, increasingly capable of rivaling what software running natively on a computer can do. Even the company with the most to lose from that direction (Microsoft) is embracing it with a Web-based version of Office... The WebGL plan emerged in March 2009 from Mozilla and the Khronos Group, which oversees the venerable OpenGL standard that lets software tap into a computer's hardware-based graphics power. WebGL's roots lie with an earlier Mozilla project called Canvas 3D, a cousin of the present two-dimensional Canvas technology for drawing graphics in Web pages... Although Google is a WebGL supporter, it's also developing a higher-level 3D graphics technology called O3D for browsers..."
http://news.cnet.com/8301-30685_3-10357723-264.html
See also the Vladimir Vukicevic blog: http://blog.vlad1.com/2009/09/21/webgl-samples/
"A natural outgrowth of XML as a software-independent documentation environment that facilitates information reuse is the need to customize that information so that its content differs based on the specific audience or output format. This reuse is commonly known as single-source documentation, because a single set of input files can satisfy the requirements of multiple audiences or output formats. Some single- source requirements are handled automatically by the tools that produce output in different formats. For example, generating PDF output for a DocBook XML document that contains a link to external resources (using the 'ulink' element) embeds both a hyperlink to that information and its actual URL in that output, while generating HTML output from the same XML document simply embeds a link in that HTML output. Transforming a single element in different ways for different output formats is a step in the right direction for single-source documentation, but it doesn't enable customization of document content beyond its presentation requirements. Being able to customize the actual content of a document based on its target output format is a fairly common requirement for modern documentation. Luckily, this is easily handled by a combination of preprocessing and taking advantage of flexible aspects of the design of documentation formats such as DocBook XML.
The power and flexibility of XML, sets of existing standards, and a rich set of tools for working with and converting XML documents provide a powerful environment for creating and maintaining documentation. The attributes and techniques discussed in this article make it easy to create conditionalized documentation that can contain different content targeted toward specific audiences, computer systems, or presentation formats. If you add a simple preprocessing stage or set variables for use in your documentation-production process, you can create and maintain single-source documentation that produces specialized output.
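A sketch of the kind of preprocessing pass the article describes: the Python snippet below removes elements whose audience profiling attribute does not match the requested audience before the document goes to the normal DocBook toolchain. DocBook's profiling attributes (audience, os, arch, condition) are standard, but this single-attribute, semicolon-separated logic is a deliberate simplification of what the real profiling stylesheets do:

    import xml.etree.ElementTree as ET

    def profile(elem, audience):
        # Drop any child whose audience attribute is present but does not list the target.
        for child in list(elem):
            value = child.get("audience")
            if value is not None and audience not in value.split(";"):
                elem.remove(child)
            else:
                profile(child, audience)

    tree = ET.parse("manual.xml")                 # hypothetical single-source input
    profile(tree.getroot(), "administrator")
    tree.write("manual-administrator.xml", encoding="utf-8", xml_declaration=True)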
http://www.ibm.com/developerworks/library/x-reuseinfo3/
See also the OASIS DocBook Publishers Schema public review: http://xml.coverpages.org/newsletter/news2009-07-03.html#cite2
On behalf of the Ontolog UoM Panel Session Organizers (Ed Barkmeyer, Howard Mason, Frank Olken, Steve Ray), Peter Yim announced the creation of a new mailing list forum to support the community workspace of the Quantity and Unit of Measure Ontology-based Standard Initiative, now also online.
According to the 'Abstract and Thoughts' from the recent Ontology Summit
2009 Symposium, "Quantities and Units of Measure" was identified as a candidate ontology-based standard that folks from the standards community and the ontology community can (and should) work together on. Further momentum has been developing through the active discussion among the community members on the mailing list. Representatives include the acknowledged authorities who maintain governance over the system of measures. Primarily: BIPM, along with various national NMIs (National Measurement Institutes) who collectively maintain key documents such as the GUM (Guidelinefor Evaluating and Expressing the Uncertainty of NIST Measurement Results), the VIM (International Vocabulary of Basic and General Terms in Metrology), UCUM (Unified Code for Units of Measure), and the like. Other related organizations would be IEC, IFCC, ISO, IUPAC, IUPAP and OIML..."
http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1Y3Y
See also slides from the June 2009 Panel Session: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2009_06_19
W3C announced the publication of a final W3C TAG Finding, authored by members of the W3C Technical Architecture Group. "The Self-Describing Web" describes how document formats, markup conventions, attribute values, and other data formats can be designed to facilitate the deployment of self-describing, Web-grounded Web content. The Web is designed to support flexible exploration of information by human users and by automated agents. For such exploration to be productive, information published by many different sources and for a variety of purposes must be comprehensible to a wide range of Web client software, and to users of that software. HTTP and other Web technologies can be used to deploy resource representations that are self-describing:
information about the encodings used for each representation is provided explicitly within the representation. Starting with a URI, there is a standard algorithm that a user agent can apply to retrieve and interpret such representations. Furthermore, representations can be what we refer to as grounded in the Web, by ensuring that specifications required to interpret them are determined unambiguously based on the URI, and that explicit references connect the pertinent specifications to each other.
Web-grounding ensures that the specifications needed to interpret information on the Web can be identified unambiguously. When such self-describing, Web-grounded resources are linked together, the Web as a whole can support reliable, ad hoc discovery of information.
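The "start from a URI" step can be seen in a couple of lines of Python: dereference the URI and let the returned media type determine which specification governs interpretation of the bytes; the URI below is just an example:

    from urllib.request import urlopen

    with urlopen("http://www.w3.org/") as resp:
        media_type = resp.headers.get_content_type()   # e.g. "text/html"
        body = resp.read()
    print(media_type, len(body), "bytes")
    # A self-describing agent would now dispatch on the registered media type:
    # text/html -> the HTML specification, application/rdf+xml -> RDF/XML, and so on.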
http://www.w3.org/2001/tag/doc/selfDescribingDocuments-2009-02-07.html
"I'm having a midlife digital identity crisis. Perhaps you share the same problem. Maybe we can work through it together. My digital personas and activities spread out over too many places. I don't know what community I belong to anymore. Surely, you know this problem. There's a sense of lost identity, being on AIM, Facebook, Flickr, FriendFeed, Ovi, Tumblr, Twitter, YouTube, Vimeo, Windows Live, Zune and many other services. I've abandoned some services and haven't yet become active on others. Companies and individuals have this sense that the digital lifestyle is larger than life -- that people can defy physical laws and be in two or more places at once. Where different digital services intersect/integrate, perhaps that's somewhat true. But if the connection is people, how many can you realistically be close to? How many places can you digitally be at once? Services like Facebook Connect or Google Friend Connect offer some remedy to online social networking silos.
But they're only a beginning, I say. I find some services falling into strange silos and others overlapping in weird ways. For example, 95 percent of my Windows Live Messenger contacts work for or are affiliated with Microsoft. AIM is a mishmash of everybody else, including eWEEK colleagues, family, friends, peers and people working for other high-tech companies. Most of my Twitter followers work for high-tech or PR firms.
So I do little to no personal tweeting. Why bother? [...] Can you be Fraker Attacker on Xbox, Frugal Juvenile on YouTube and Perry Pissant on USTREAM? That's all without accounting for specialized interests, like photography, role-play gaming, skate boarding or Yo-Yo collecting.
Those activities open up other digital and social activities. For example, a Nikon camera owner might join Nikonians or some other photo forum. Do you ever wonder about the mental health of having all these digital personalities? Doctors institutionalize people for split-personality disorders. Do you digitally know who you are?..."
http://www.microsoft-watch.com/content/digital_lifestyle/whats_your_digital_lifestyle.html
New search technology allows Google's search engine to identify associations and concepts related to a query. Google has given its Web search engine an injection of semantic technology, as the search leader pushes into what many consider the future of search on the Internet.
The new technology will allow Google's search engine to identify associations and concepts related to a query, improving the list of related search terms Google displays along with its results. Ori Allon, technical lead of Google's Search Quality team, said in an interview Tuesday that the search improvement involves a dollop of semantic search technology mixed in with a big helping of lightning-fast, on-the-fly data mining: "This is a new approach to query refinement because we're finding concepts and entities related to queries while you do a search, so everything is happening in real time and not [pre-assembled]," he said. "Because we're doing it in real time, we're able to target many more queries. The use of semantic search isn't more broad at this point because full conceptual analysis of documents would slow down the process of generating query refinements on the fly.
If we want to get it all done in a matter of milliseconds, there's a lot of innovations we still have to do. A full semantic search would be very hard to do in this limited amount of time." Offering query-refinement suggestions is but the first application of the technology behind these enhancements, so users can expect other concrete improvements applied to things like search ranking. Google has often been criticized for using what is considered an aging approach to solving search queries based primarily on analyzing keywords and not on understanding their meaning. There is an entire field of Google competitors that are busy developing and perfecting semantic search engines, betting that they will be able to deliver on the promise of this technology: to let users type in queries in natural language and have the search engine understand their meaning and intent... Microsoft last year acquired Powerset, one of these companies, in order to improve its Web search engine with semantic search technology. Google also rolled out on Tuesday another enhancement to its search engine: longer "snippets," which are the text excerpts Google extracts from Web sites to show in search results where the query keywords appear.
http://www.infoworld.com/article/09/03/24/Google_rolls_out_semantic_search_capabilities_1.html
See also the Ori Allon Google blog: http://googleblog.blogspot.com/2009/03/two-new-improvements-to-google-results.html
This article discusses the use of fuzzy description logics and patterns to automatically determine the sacred figure depicted in an icon. As the amount of the Web's cultural content grows, search and retrieval procedures for that content become increasingly difficult. Moreover, Web users need more efficient ways to access huge amounts of content.
So, researchers have proposed sophisticated browsing and viewing technologies, raising the need for detailed metadata that effectively describes the cultural content. Several annotation standards have been developed and implemented, and Semantic Web technologies provide a solution for the semantic description of collections on the Web.
Unfortunately, the semantic annotation of cultural content is time consuming and expensive, making it one of the main difficulties of cultural-content publication. Therefore, the need for automatic or semi- automatic analysis and classification of cultural assets has emerged...
Some cultural domains are appropriate for automatic analysis and classification methods. Byzantine icon art is one of them. The predefined image content and the low variability of the image characteristics support the successful application of image analysis methods... Byzantine iconography follows a unique convention of painting. The artistic language of Byzantine painters is characterized by apparent simplicity, overemphasized flatness, unreal and symbolic colors, lack of perspective, and strange proportions. The sacred figures are set beyond real time and real space through the use of gold backgrounds. From Art Manual to Semantic Representation: Although the knowledge in Dionysios's manual concerns vague concepts such as 'long hair,' 'young face,' and so on, it's quite strict and formally described. Consequently, we can create an ontological representation of this knowledge using OWL. In this way, the ontology's axiomatic skeleton will provide the terminology and restrictions for Byzantine icons... The 'Knowledge representation and reasoning' subsystem consists of terminological and assertional knowledge and a reasoning engine. These types of knowledge are the basic components of a knowledge-based system based on DLs, a structured knowledge-representation formalism with decidable-reasoning algorithms. DLs have become popular, especially because of their use in the Semantic Web (as in OWL DL, for example). DLs represent a domain's important notions as concept and role descriptions. To do this, DLs use a set of concept and role constructors on the basic elements of a domain-specific alphabet.
This alphabet consists of a set of individuals (objects) constituting the domain, a set of atomic concepts describing the individuals, and a set of atomic roles that relate the individuals. The concept and role constructors that are used indicate the expressive power and the name of the specific DL. Here, we use SHIN, an expressive subset of OWL DL that employs concept negation, intersection, and union; existential and universal quantifiers; transitive and inverse roles; role hierarchy; and number restrictions. Results: We evaluated our system on a database, provided by the Mount Sinai Foundation in Greece, containing 2,000 digitized Byzantine icons dating back to the 13th century. The icons depict 50 different characters; according to Dionysios, each character has specific facial features that makes him or her distinguishable.
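As a toy illustration of the kind of axiom involved, the Python/rdflib sketch below encodes a Dionysios-style rule as an OWL existential restriction; the class and property names (IconOfJohnTheBaptist, hasFeature, LongHair) are invented for illustration, and only the OWL/RDFS vocabulary itself is standard:

    from rdflib import Graph, Namespace, BNode
    from rdflib.namespace import OWL, RDF, RDFS

    EX = Namespace("http://example.org/icons#")
    g = Graph()
    g.bind("ex", EX)

    # "Icons of John the Baptist depict a figure with long hair":
    #   ex:IconOfJohnTheBaptist ⊑ ex:ByzantineIcon ⊓ ∃ ex:hasFeature . ex:LongHair
    restriction = BNode()
    g.add((restriction, RDF.type, OWL.Restriction))
    g.add((restriction, OWL.onProperty, EX.hasFeature))
    g.add((restriction, OWL.someValuesFrom, EX.LongHair))
    g.add((EX.IconOfJohnTheBaptist, RDFS.subClassOf, EX.ByzantineIcon))
    g.add((EX.IconOfJohnTheBaptist, RDFS.subClassOf, restriction))
    print(g.serialize(format="turtle"))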
Evaluation of the Byzantine-icon-analysis subsystem produced promising results. The subsystem's mean response time was approximately 15 seconds on a typical PC. In the semantic-segmentation module, the face detection submodule reached 80 percent accuracy. In most cases, the failure occurred in icons with a destroyed face area...
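To make the description-logic machinery concrete, here is a minimal sketch of the kind of terminological (TBox) axiom such an ontology might contain, written in standard DL notation as LaTeX. The concept and role names (HypotheticalSaint, hasHair, LongHair, hasHalo, and so on) are illustrative assumptions, not terms from the paper's actual ontology:

  % Illustrative TBox axiom (hypothetical names): a sacred figure recognizable
  % by long hair, a young face, and at least one halo (unqualified number restriction).
  \mathit{HypotheticalSaint} \equiv \mathit{SacredFigure}
      \sqcap \exists \mathit{hasHair}.\mathit{LongHair}
      \sqcap \exists \mathit{hasFace}.\mathit{YoungFace}
      \sqcap \; {\geq} 1\, \mathit{hasHalo}

Image-analysis results asserted as individuals and role assertions (ABox facts) can then be classified by a DL reasoner against definitions of this form to suggest which figure an icon depicts.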
http://www.computer.org/portal/cms_docs_intelligent/intelligent/homepage/2009/x2tzo.pdf
See also the Web Ontology Language (OWL): http://www.w3.org/2004/OWL/
Google has released software called O3D to bring accelerated 3D graphics to browsers, a significant effort but not the only one to try to endow Web applications with some of the computing muscle that PC programs can use... O3D is a browser plug-in for Internet Explorer, Firefox, Safari, and Chrome that works on Windows, Mac OS X, and Linux, but Google hopes that eventually, the technology will be built directly into browsers. It provides an interface that lets developers' Web-based JavaScript programs tap directly into a computer's graphics chip, which could mean better games and other applications... Firefox backer Mozilla and the Khronos Group, which oversees the widely used OpenGL 3D interface standard, announced their own effort to build a 3D Web interface. The two efforts, while tackling the same basic idea, use different approaches... From the Google Code blog: "Most content on the web today is in 2D, but a lot of information is more fun and useful in 3D. Projects like Google Earth and SketchUp demonstrate our passion and commitment to enabling users to create and interact with 3D content.
We'd like to see the web offering the same type of 3D experiences that can be found on the desktop. That's why, a few weeks ago, we announced our plans to contribute our technology and web development expertise to the discussions about 3D for the web within Khronos and the broader developer community. Today, we're making our first contribution to this effort by sharing the plugin implementation of O3D: a new, shader-based, low-level graphics API for creating interactive 3D applications in a web browser. When we started working on O3D, we focused on creating a modern 3D API that's optimized for the web. We wanted to build an API that runs on multiple operating systems and browsers, performs well in JavaScript, and offers the capabilities developers need to create a diverse set of rich applications. O3D is still in an early stage, but we're making it available now to help inform the public discussion about 3D graphics in the browser. We've also created a forum to enable developers to submit suggestions on features and functionality they desire from a 3D API for the web..."
http://news.cnet.com/8301-17939_109-10224078-2.html
See also the blog article: http://google-code-updates.blogspot.com/2009/04/toward-open-web-standard-for-3d.html
Yahoo has offered a peek at how its search results are likely to be displayed a few months from now, as it tries to find a better alternative to the traditional "10 blue links." "People don't really want to search," said Prabhakar Raghavan, head of Yahoo Labs and Yahoo's search strategy, in a [recent] meeting with reporters in San Francisco.
Their objective is to quickly uncover the information they are looking for, not to scroll through a list of links to Web pages. Yahoo's answer is to try to figure out the "intent" of the person conducting the search, and then present various types of information within the results that relate to what they are looking for, such as restaurant reviews, movie times, flight schedules and so on. Yahoo showed a slightly different page layout for displaying search results that it's currently testing with users. Search results for the name of a restaurant lead off with a map showing its location, followed by links to an aggregated selection of reviews, photos and directions. Yahoo is revamping its image search in a similar way. Moving away from the "blue links" is something all the main search companies have been exploring. Even Google, which dominates Web search and has the least to gain from disrupting the status quo, has been blending news, video and other content with its results.
Microsoft CEO Steve Ballmer has admitted how tough it is to beat Google at its own game, and suggested that the only way to win market share in search is to change the playing field and do things differently... Part of the challenge is figuring out the user's intent. "You cater to the user's intent as best you can define it," Raghavan said. For example, there are many towns in the world called Syracuse, but if a person is searching for "Syracuse restaurant" and it's 6 p.m. Eastern Time, there's a good chance they are in New York, because that's where it's dinner time. The other challenge is creating the web of objects: a structured representation of real-world entities such as restaurants, movies, and people, rather than just pages of links.
Yahoo plans to do it with software algorithms but also using "the wisdom of crowds." Specifically, it will use data provided through its SearchMonkey project, which encourages site owners to provide structured data about the content on their Web sites. Still, building the web of objects is a long-term effort and will apply to only a fraction of search queries to begin with. "This is going to take years" to complete, Raghavan said.
http://www.pcworld.com/businesscenter/article/165214/yahoo_vows_death_to_the_10_blue_links.html
Case-sensitive passwords are common on the Internet. And now, perhaps, the global community of Internet developers might have to prepare for a case-sensitive semantic Web. The lowercase 's' semantic Web that might offer users a far richer Internet experience differs dramatically from the Semantic Web -- a specific framework defined by the W3C -- that has hovered tantalizingly just out of mass reach for almost a decade.
Researchers are exploring several different approaches to providing an alternate semantic experience, from domain-specific academic research engines to commercial offerings from established companies such as Google and startups such as Kosmix... Numerous academic and commercial researchers are exploring ways to access and index the contents in the deep Web -- mainly content hidden behind HTML forms in databases -- in order to offer users more information to answer their queries. Currently, much of the academic research centers on domain-specific aspects of the deep Web because academic funding is inadequate to tackle form discovery and indexing over the entire Web. The communities might approach the issue from slightly different vectors, but there is near consensus that the crux of delivering richer material to the Internet user is developing a way to access deep Web and surface Web pages and provide some sort of semantically aware architecture to constrain query results. [Researchers] Geller and Chun wish to go beyond indexing form labels and form-field values of the deep Web: "The DeepPeep search engine looks for domain-specific forms that may lead users to desired deep Web contents.
Our initial approach to extracting Web form labels, to use them as index terms, is similar to their approach reported in VLDB (Very Large Data Base) 2008. However, what we advocate is to annotate the forms in a way such that even the generic search engines such as Google and Yahoo can locate the deep Web forms. This requires not only the labels used in the forms to be indexed, which seems to be the predominant method used in DeepPeep, but that the semantic contents of the deep Web also be available for search..." Significant advances in semantic enrichment -- in both the deep and surface Web -- will face different hurdles in different settings.
Although academic researchers might be hampered by the inability to create an infrastructure scalable enough to attract large numbers of users, commercial entities such as the large search engines might find it difficult to alter their revenue-producing platform architectures to accommodate nascent semantic technologies, even technologies that don't rely on orthodox Semantic Web elements such as the Resource Description Framework and Web Ontology Language...
http://www2.computer.org/cms/Computer.org/ComputingNow/homepage/2009/0509/rW_IC_semanticWeb.pdf
"Tales from the encrypt: If you care about the integrity of your data, it's time to investigate solutions for accessing and securing it -- and not just for the here and now... Like an increasing number of people who care about the security and integrity of their data, I have encrypted all my hard-drives -- the ones in my laptops and the backup drives, using 128-bit AES -- the Advanced Encryption Standard. Without the passphrase that unlocks my key, the data on those drives is unrecoverable, barring major, seismic advances in quantum computing, or a fundamental revolution in computing. Once your data is cryptographically secured, all the computers on earth, working in unison, could not recover it on anything less than a geological timescale...
But what if I were killed or incapacitated before I managed to hand the passphrase over to an executor or solicitor who could use it to unlock all this stuff that will be critical to winding down my affairs -- or keeping them going, in the event that I'm incapacitated? I don't want to simply hand the passphrase over to my wife, or my lawyer. Partly that's because the secrecy of a passphrase known only to one person and never written down is vastly superior to the secrecy of a passphrase that has been written down and stored in more than one place. Further, many countries' laws make it difficult or impossible for a court to order you to turn over your keys; once the passphrase is known by a third party, its security from legal attack is greatly undermined, as the law generally protects your knowledge of someone else's keys to a lesser extent than it protects your own..."
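As a rough illustration of the passphrase-based protection described above (not the author's actual setup), the sketch below derives a key from a passphrase and encrypts a backup file using Python's third-party 'cryptography' package; Fernet uses 128-bit AES plus an integrity check under the hood. The file names and passphrase are placeholders, and real full-disk encryption is done with dedicated tools rather than application code:

  import base64, os
  from cryptography.fernet import Fernet
  from cryptography.hazmat.primitives import hashes
  from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

  def key_from_passphrase(passphrase: bytes, salt: bytes) -> bytes:
      # Stretch the passphrase into the 32-byte key format Fernet expects.
      kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32,
                       salt=salt, iterations=600_000)
      return base64.urlsafe_b64encode(kdf.derive(passphrase))

  salt = os.urandom(16)  # random salt, stored alongside the ciphertext
  f = Fernet(key_from_passphrase(b"a long passphrase never written down", salt))
  ciphertext = f.encrypt(open("backup.tar", "rb").read())
  open("backup.tar.enc", "wb").write(salt + ciphertext)

Without the passphrase (and hence the derived key), the ciphertext is as unrecoverable as the article describes.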
http://www.guardian.co.uk/technology/2009/jun/30/data-protection-internet
"New technologies that show the economy of using general-purpose hardware for high-volume HTTPS traffic... It is estimated that the Internet connects 625 million hosts. Every second, vast amounts of information are exchanged amongst these millions of computers. These data contain public and private information, which is often confidential and needs to be protected. Security protocols for safeguarding information are routinely used in banking and e-commerce. Private information, however, has not been protected on the Internet in general. Examples of private information (beyond banking and e-commerce data) include personal email, instant messages, presence, location, streamed video, search queries, and interactions on a wide variety of on-line social networks.
The reason for this neglect is primarily economic. Security protocols rely on cryptography, and as such are compute-resource-intensive. As a result, securing private information requires that an on-line service provider invest heavily in computation resources. In this article we present new technologies that can reduce the cost of on-line secure communications, thus making it a viable option for a large number of services.
The motivation behind our research is primarily to enable widespread use of, and access to, HTTPS. It is important for service providers and users to be able to trust each other for their mutual benefit. An important aspect of the trust comes from knowing that private communications are kept confidential and adhere to the policies established between providers and users.
In summary: we are researching new technologies that offer cryptographic algorithm acceleration by factors. Our ultimate goal is to make general-purpose processors capable of processing and forwarding encrypted traffic at very high speeds so that the Internet can be gradually transformed into a completely secure information delivery infrastructure. We also believe that these technologies can benefit other usage models, such as disk encryption and storage..."
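For a sense of what serving traffic over HTTPS involves, the following minimal sketch (unrelated to the authors' acceleration work) wraps a plain Python web server in TLS; the certificate and key file names are assumptions. The computational cost the article targets is incurred inside the wrapped socket on every handshake and encrypted record:

  import http.server
  import ssl

  # Serve the current directory over HTTPS on port 8443.
  server = http.server.HTTPServer(("0.0.0.0", 8443),
                                  http.server.SimpleHTTPRequestHandler)

  context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
  context.load_cert_chain(certfile="server.crt", keyfile="server.key")  # assumed paths
  server.socket = context.wrap_socket(server.socket, server_side=True)

  server.serve_forever()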
http://www.ddj.com/security/218102294
Even after 20 years, the Web continues to redefine itself. The Internet is transforming from a hypertext document system to something that resembles a full-blown operating system. This article focuses on a critical piece of functionality missing from the emerging cloud-based operating system: a standards-based Web clipboard. You discover what a Web clipboard might look like using AtomPub and the AtomClip XUL Firefox extension... My goals for AtomClip are modest: implement a Web clipboard using open standards and existing technologies to clip simple content and provide a foundation for more sophisticated data types and objects. I use a Firefox extension as the client application and an Atom XML feed, provided by the eXist XML database, as Web storage... A Web clipboard using open standards and technologies that are currently deployed has good adoption characteristics. For a Web clipboard to provide more sophisticated functionality, a simple scenario must first be addressed. With a combination of XUL, Atom XML feeds, and AtomPub, you have a powerful set of technologies based on what is popular on the Web right now.
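The clipboard 'copy' operation described here boils down to an AtomPub POST of an Atom entry to a collection. The sketch below, using only the Python standard library, shows roughly what that request looks like; the collection URL is a placeholder for whatever the eXist-backed Atom store exposes, not the article's actual configuration:

  import urllib.request

  ENTRY = """<?xml version="1.0" encoding="utf-8"?>
  <entry xmlns="http://www.w3.org/2005/Atom">
    <title>Clipped snippet</title>
    <author><name>clipboard-user</name></author>
    <content type="text">Text copied from a Web page.</content>
  </entry>"""

  # POST the entry to an AtomPub collection (placeholder URL).
  req = urllib.request.Request(
      "http://localhost:8080/exist/atom/edit/clipboard",
      data=ENTRY.encode("utf-8"),
      headers={"Content-Type": "application/atom+xml;type=entry"},
      method="POST",
  )
  with urllib.request.urlopen(req) as resp:
      # A successful AtomPub create returns 201 Created plus the new member's Location.
      print(resp.status, resp.headers.get("Location"))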
http://www.ibm.com/developerworks/xml/library/x-cldexp/
See also Atom references: http://xml.coverpages.org/atom.html
Dion Hinchcliffe has recently offered two related articles that explore relationships between Web Oriented Architecture (WOA) and other technologies. The first article deals with WOA and REST; the second looks at WOA and SOA. The main point of the first article: REST is a style and WOA is the architecture. The second article argues that WOA is really a highly complementary sub-style of SOA and explores the implications of this simple observation... Hinchcliffe defines WOA in two parts: a core that contains REST, URLs, SSL, and XML; and a "WOA Full" that includes protocols and interfaces (e.g. BitTorrent), identity and security (e.g. OpenID), distribution and components (e.g. Open APIs), and data formats and descriptions (e.g. ATOM). These are organized in a WOA Stack of six levels, with example technologies:
[1] Distribution: HTTP, feeds
[2] Composition: Hypermedia, Mashups
[3] Security: OpenID, SSL
[4] Data Portability: XML, RDF
[5] Data Representation: ATOM, JSON
[6] Transfer Methods: REST, HTTP
This stack reinforces the relationship between WOA and REST, with the latter being fundamental to and supportive of the larger architectural idea.
http://www.infoq.com/news/2009/06/hinchcliffe-REST-WOA
See also SOA Considers Web-Oriented Architecture (WOA) in Earnest: http://hinchcliffe.org/archive/2008/09/08/16676.aspx
Nokia has introduced a set of developer tools that seeks to make it easier for Web developers to create widgets for mobile phones: Web Runtime plug-ins for Microsoft's Visual Studio, Adobe's Dreamweaver, and Aptana Studio, enabling third parties to create mobile widgets using HTML, CSS, JavaScript, Ajax, and other standard Web development code.
The company said this should streamline the development process, as well as attract a new set of content creators who may not have mobile experience.
According to the text of the announcement: "Nokia WRT widgets provide mobile users instant access to customizable information or tools drawn in real-time from the Internet. Popular widgets range from breaking news headlines to stock-market tickers, social network status updates, flight arrival schedules, localized daily weather and more. Nokia's WRT is built on the same open-source, industry standard WebKit project environment used by Web Browser for S60, Nokia's full-HTML browser for the latest devices running on Nokia's S60 platform. The world's leading smartphone platform, S60 on Symbian OS, accounts for more than 180 million devices cumulatively shipped by S60 licensees, including Nokia, Samsung, Lenovo and LG."
http://www.informationweek.com/news/internet/webdev/showArticle.jhtml?articleID=217800324
See also the Nokia announcement: http://www.nokia.com/A4136001?newsid=1321415
Twitter is undoubtedly one of the most recent and successful examples of social networking to appear on the World Wide Web. Twitter provides an API so Web developers can enable their users to access the various features that the Twitter site provides. In this article, learn the basics of using the Twitter REST API... Twitter is a fabulous entry in the Web 2.0 genre. Using Twitter, you can microblog your way to building an entire online network of individuals who share common interests with you. Using the Twitter REST API, you can automate just about everything you can do with Twitter manually. You can programmatically access a specific user's timeline. You can reply to that user, either directly or indirectly. You can search a user's tweets for information specific to your own interests. You can filter tweets based on certain criteria and display those tweets on your own blog. The possibilities are endless...
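As a rough sketch of the kind of call the article walks through, the snippet below pulls a user's recent public tweets as JSON using only the Python standard library. The URL reflects the unauthenticated REST endpoint as it existed when the article was written; the API has since changed and now requires authentication, so treat the endpoint and field names as historical assumptions:

  import json
  import urllib.request

  # Public user timeline, Twitter REST API v1 (historical, unauthenticated).
  url = "http://twitter.com/statuses/user_timeline/twitterapi.json?count=5"
  with urllib.request.urlopen(url) as resp:
      for tweet in json.load(resp):
          print(tweet["created_at"], "-", tweet["text"])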
http://www.ibm.com/developerworks/xml/library/x-twitterREST/
Opera 10 Alpha: Services for Sharing Files, Music and Photos, Hosting Gregg Keizer, ComputerWorld
In a bid to boost its share of the browser market, Opera Software has unveiled an alpha build of Opera Unite, a technology platform that adds a compact Web server to its browser and lets users share files, photos and music without using third-party services. "We are enabling every single computer to be a two-way street on the Internet," CEO Jon von Tetzchner said in a Webcast the company held early this morning to introduce the early version of Unite. The collaborative technology has been embedded in Opera 10, the still-in-beta browser that the company will release in final form alongside that application...
According to the Opera Unite web site: "Opera Unite allows you to easily share your data: photos, music, notes and other files. You can even run chat rooms and host entire Web sites with Opera Unite. It puts the power of a Web server in your browser, giving you greater privacy and flexibility than other online services. What if you use Opera at home, and a different Web browser at work? Opera Unite services can be accessed from any modern browser, including mobile browsers. At home, just select what you want to share, and you can view it later using your work Web browser without any problems...
Simply enable Opera Unite when you start Opera, and you are ready to go. Find and install services with one click from our online catalog or easily create your own by using Web standards like HTML, CSS, JavaScript, SVG and AJAX..."
http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9134423
See also the Introduction to Opera Unite: http://dev.opera.com/articles/view/an-introduction-to-opera-unite/
"What do data archivists have in common with monarch butterflies, salmon, and most geese? They are always preparing for their next migration. From magnetic tape through hard disks, floppy disks, CDs, DVDs, and Blu-ray, digital data formats change periodically, forcing archivists to migrate their data before obsolescence or end-of-lifetime sets in. But now Tadahiro Kuroda, a research engineer and professor at Keio University, in Yokohama, Japan, and his team have come up with a device that could put an end to this recurring -- not to mention costly -- upheaval in data preservation by maintaining the data safely in an unchanging format that's predicted to last 1000 years... The device is a permanent memory system based on semiconductor technology. The prototype consists of four stacked 300-millimeter silicon wafers incorporating 2.5 terabits, (320 gigabytes), of data encoded on read-only memory and fabricated using a 45-nanometer complementary metal-oxide-semiconductor (CMOS) process, together with a separate data reader... The "Digital Rosetta Stone" would store data and give wireless access to it for 1000 years..."
http://www.spectrum.ieee.org/semiconductors/memory/digital-data-written-in-stone
"OAuth is an open protocol that lets users share their protected resources among different Web sites, without risking exposure of users' credentials. Part 1 of this series introduced OAuth and showed you how to develop an OAuth-enabled desktop Twitter client. In Part 2, you learned how to develop an OAuth-enabled Web Twitter client. In this final part of the series, we deploy the Web application developed in Part 2 to the Google App Engine (GAE).
GAE, which is provided by Google, enables Web applications to run on Google's infrastructure. A big benefit of GAE is that your applications can easily scale as your traffic and data storage needs grow. You can focus on software development, without worrying about Web and database server maintenance. With a reasonable amount of traffic to your applications deployed on GAE, you can use it for free. When more and more users are attracted to your site, you can buy more CPU time and data storage from Google. As of the writing of this article, GAE supports both Python and Java code...
OAuth provides a better way for a consumer site to access a user's protected resources held on a service provider. With OAuth, credentials are never exposed to sites other than where the user's data is originally held..."
"The OAuth 1.0 Protocol," recently published by IETF as (Informational) Request for Comments 5849, "provides a method for clients to access server resources on behalf of a resource owner (such as a different client or an end-user). It also provides a process for end-users to authorize third-party access to their server resources without sharing their credentials (typically, a username and password pair), using user-agent redirections."
http://www.ibm.com/developerworks/web/library/wa-oauth3/
See also The OAuth 1.0 Protocol published as RFC 5849: http://tools.ietf.org/html/rfc5849