I enjoyed this (my third) conference experience most of all. Over the course of three years I’ve come to know more folks in the community, and that makes a big difference. Another big win this year was bringing along a colleague to help deliver the talk. One small but significant treat is visiting a talk and getting more out of it than you expected. Paul Gearon and James Leigh gave a talk called “merging RDF stores” which was really a summary of layout methodologies for triple stores. They covered the common techniques and pointed to the trade-offs and advantages of each, depending on the characteristics of your dataset. Paul gave me an interesting explanation of how he solves the minimum spanning tree problem, a very useful query that cannot be handled with basic SPARQL.
I happened to run into Harold Carr. We were sitting next to each other and I asked if anyone had been to Jazoon, and he chimed in “I’ll be there next week”…then after reading his badge I remembered I had read his blog at java.net. Seems as though Harold is the only other semweb enthusiast at Sun besides Henry Story (who was conspicuously absent). The conference keynotes were better; however, they were mostly very “NLP” based. Tools like WolframAlpha, bing.com, and Siri were highlighted. Seems like the idea of a “task oriented” user interface is gaining traction, and that seems to necessarily include understanding “sentences”. Another aspect of the conference that stood out was RDFa. RDFa is now being indexed by Yahoo’s SearchMonkey and by Google as well. Mark Birbeck demo’d how he helped integrate UK ministry job openings into one index by having each individual website use common RDFa markup within their job posting pages. Talis was also in the house, among them Leigh Dodds, whom I recognized from his blog photo. I also got to meet Paul “lucky charms” Miller, who blogs at ZDNet.
During my “jena community of practice” talk I presented on jenabean, geosparql, and our work at Travelocity on Asydeo. Thankfully we had Dave Reynolds from HP Labs to give an update on Jena. Feels like the semantic web is beginning to take shape. Best Buy is emitting RDFa, the New York Times announced a linked data service, and the BBC is using RDFa as well (BackstageBBC).
I was having a conversation with a respected colleague yesterday and we began discussing publications. His comment was that he stopped reading Dr. Dobb’s because it was full of complicated algorithms he’d never use. That struck a nerve within me. Although a side of me respects the drive towards simplicity, some things are necessarily complex. It seems that the computing field is becoming less tolerant of thinking hard about difficult problems. Early in my career we would ask candidates to pseudocode search algorithms or discuss the running time of a common sort using asymptotic notation. Now we primarily want to know if they understand TDD and have used Grails before. Just last weekend at a regional conference I heard a thought-provoking talk from an industry veteran about usability. He was soft-spoken; however, the content he delivered was solid and interesting. Afterward I heard some complain that he wasn’t animated enough. Later in the day I heard a “social media” expert give a talk on Twitter. It was interesting and funny, and yet there was nothing profound or substantive in his talk. It was momentarily entertaining, but not filling, like a soda and a candy bar.
Just yesterday I heard that HP is shutting down their non-US based labs groups. So it seems that the “smarties” are not earning their keep. I believe there is a slow dumbing down of the technology profession. Some are concerned about dropping enrollment in CS, but I’d counter that it’s a sign of intelligence. The wise young person can probably perceive that intelligence has more leverage in other venues.
I attended The Big (D)esign Conference yesterday. I was impressed with how well it was organized and with the implications for future technology conferences. It was organized by a team of volunteers as a joint venture by the Dallas/Fort Worth Usability Professionals’ Association, Refresh Dallas, and the Dallas/Fort Worth Interaction Design Association. During the conference I was able to see some top-notch speakers, and the venue was small enough to ensure you got a chance to speak face to face with nearly any one of them. It cost $50 and a short trip to SMU. Contrast that with the $1,000+ and flight it normally requires to get a lineup of similar caliber.
It was held at the student center at SMU. I liked the venue because it was small, and yet still had plenty of conference rooms to allow 4 simultaneous tracks. The sponsors set up booths at the center…so there was no partitioned-off exhibit floor; they were front and center.
The first semantic web meetup was held May 20th, at the co-working office of http://www.companydallas.com/. As I expected, it was a motley crew. The “semantic web” space consists of RDF/OWL interests, natural language processing, and mathematical techniques such as LSA (latent semantic analysis)…it’s really a diverse set of interests and I sometimes wonder what joins them all together. I gave a talk on geosparql, a tool built on Google’s new App Engine for Java. My favorite talk was from 80legs.com. They have a new cloud compute model that targets web crawling. Somebody asked “what does that have to do with the semantic web?”. I was thinking to myself…everything! The semantic web requires that one be able to scrape here and there, or at least I see it that way. Swingly was also introduced, an index of questions and answers found on the internet…again, very interesting. PureDiscovery came out to set us straight and remind us that ontology creation will never work, so why do we keep trying?
The JDK is full of gems. My latest discovery began with a need for a simple in-memory “least recently used” (LRU) cache. I wanted a hash map with limited size. When the cache is full, it should remove the least recently used item, keeping the more popular items in memory. I figured there was probably an existing implementation in java.util, but nothing fit until I ran across some posts that mentioned “LinkedHashMap” as a possible solution…Bingo. The Javadoc even mentions this type of application and provides a method ready for you to override when you extend LinkedHashMap:
removeEldestEntry()
Today I found a good example with documentation. For anyone who is curious, or who would like an off-the-shelf, ready-to-use example of an LRU cache for Java, see
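Separately from that example, here is a minimal sketch of the idea as I understand it. The class name LruCache and the maxEntries constructor parameter are my own inventions for illustration; the only JDK pieces relied on are LinkedHashMap's three-argument constructor (with accessOrder set to true) and the removeEldestEntry() hook.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class name; maxEntries is an assumed parameter for illustration.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently accessed.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by put() after each insertion; returning true drops the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}

class LruCacheDemo {
    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");     // touch "a", so "b" is now the least recently used
        cache.put("c", 3);  // over capacity: "b" is evicted
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

Keep in mind that LinkedHashMap is not synchronized, so a cache shared between threads would need something like Collections.synchronizedMap() wrapped around it.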
The Google Code wiki system allows you to insert Google gadgets into your pages, as well as the front page of your open source project. Do this by using the wiki:gadget tag:
<wiki:gadget width="450" height="365" border="0" url="http://link/to/mygadget.xml" title="title of your gadget" />
That’s the easy part…the XML file describes the content you’d like Google to “iframe” into your wiki page.
<Module>
<ModulePrefs title="Developer Forum" />
<Content type="html">
<![CDATA[
your stuff here
]]>
</Content>
</Module>
The simplest thing to do is host this file within your SVN repo. To see an example, here’s my XML file that includes an embed for a related screentoaster demo. I merely took the HTML code suggested on screentoaster and placed it in the gadget module descriptor.
It’s not super simple, but it is possible to truncate your Google App Engine datastore. Please don’t do as I did and attempt to use the web GUI to delete page after page of data for anything over a few hundred entities. Instead, use this simple servlet example after changing “myKindOfData” to your own type. It’d be even better if you spent a few minutes improving the servlet so that it takes a parameter supplying the “kind” of entity.
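import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.servlet.http.*;
import com.google.appengine.api.datastore.*;

public class TruncateServlet extends HttpServlet {
    @Override
    public void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
        Query q = new Query("myKindOfData"); // change to your own entity kind
        // Collect every key first, then delete them in a single batch call.
        List<Key> keys = new ArrayList<Key>();
        for (Entity taskEntity : datastore.prepare(q).asIterable()) {
            keys.add(taskEntity.getKey());
        }
        datastore.delete(keys);
    }
}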
What I learned doing this is that when you’re deleting many rows, you’ll want to first collect the keys, then supply the collection to the delete() method, as opposed to deleting inside the query loop, which will time out on you. (GAE is very picky about how long web requests take.) I like how GAE applies all these restrictions on us…it forces you to economize and avoid long-running servlet requests. Don’t leave this servlet on an open URL for too long; anybody can come along and delete your data.
This post describes how to make a Maven repo for your Google Code project. I’m a big fan of Google Code. It’s a tremendous service for those looking for a place to host their open source code. The only thing I lacked was a Maven repo. I have to be honest here: while I use Maven, it’s still a big mystery to me how things end up in the main Maven repo. If Google were to add one new feature, I’d prefer it be a Maven repo and a Hudson build that autodiscovers Maven projects and builds them…but we’re not there yet, so the question I’m going to answer is how you can provide a Maven repo for your open source customers hosted on Google Code.
Recently on the Jenanews group there was a question regarding classpath and how frustrating it can be to properly configure that aspect of a new project. I began to answer the question and realized I haven’t touched a classpath for years simply because the tools I use make that unnecessary. Eclipse is free and has very good maven integration. At the same time, the Jena team is providing jena as a Maven asset indexed on the main maven repo. The consequences are that you can have eclipse create a new project for you, and add your library dependencies for you by simply declaring that your project “uses” jena. Here is a quick screentoaster demo to get you going…
(This post deals with features provided by jenabean 1.0.1, available from the project site. You’ll also need Jena, HP’s semantic web framework.)
In my last post we looked at reading SIOC directly off the web. The other side of the coin is producing syntactically valid SIOC statements from Java. You may want to create RDF for another consumer, or perhaps you want to persist the SIOC statements directly into a Jena model. Either way, if you use the direct approach of coding to Jena’s RDF API, you’ll be writing quite a few lines of code. This task can be made simpler using Jenabean’s “Thing” utility along with a specialized interface to the SIOC vocabulary. We’ll be looking at this simple example, which duplicates the primary RDF example given in the SIOC specification document.