Using Neo4j Graph Databases With ColdFusion

After last week, I decided to put off picking a new frontend platform for my Semantic Web rubric project and focus a bit on the server backend.

Since this is just a proof-of-concept project at this point, I can afford to take some risks in choosing technologies. I’ve been following the developments around using graph databases for storing data, especially for Semantic Web applications. One project that kept coming up was Neo4j, a graph database engine built in Java. I figured now was a good time to try it out. My server-side logic is built in ColdFusion, and integrating open source Java projects like Neo4j into CF applications is generally a snap.

Aside from one hiccup, porting Neo4j’s 1-minute Java “Hello World” example to CFML proved to be fairly straightforward. The process I used to get this working is detailed below. I’d suggest that you skim over the Java example before continuing – I’m sure I left out some of the exposition.

First, add the Neo4j JAR files to the ColdFusion server:

  • Download the Neo4j “Apoc” distribution and unpack it somewhere convenient. I’m using Mac OS X, so I put things like this in ~/lib/neo4j-apoc-1.0.
  • Add the Neo4j JAR files to the ColdFusion classpath. Log into your ColdFusion Administrator, and select Server Settings -> Java and JVM. Enter the path to the lib folder in your Neo4j distribution in the ColdFusion Class Path field.
  • Restart your ColdFusion server. If you’re at all nervous, log back in to the ColdFusion Administrator and verify that the Neo4j jars are indeed listed on your classpath.
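
If you would rather check from code than from the Administrator, a quick test along these lines should do it. This is a minimal sketch: it assumes the class name used in the rest of this post, and it relies on CreateObject() throwing an error when the class cannot be found on the classpath.

<cftry>
   <!--- Only asks for a handle on the class; no database is opened here --->
   <cfset neoClass = createObject("java", "org.neo4j.kernel.EmbeddedGraphDatabase") />
   <cfoutput>Neo4j classes found on the classpath.</cfoutput>

   <cfcatch type="any">
      <cfoutput>Could not load Neo4j: #cfcatch.message#</cfoutput>
   </cfcatch>
</cftry>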

Once this is complete, you can initialize a new database for your ColdFusion app. Decide where you want the CF server to create the Neo4j data files and pass that to the object’s init() method. I put mine in a folder under /tmp on Mac OS X.

<cfset dbroot = "/tmp/neo4jtest/" />

<cfset graphDb = createObject('java',
                  "org.neo4j.kernel.EmbeddedGraphDatabase") />
<cfset graphDb.init(dbroot & "var/graphdb") />

[Aside for non-ColdFusion folks: CF doesn't instantiate Java objects quite how you'd expect. The call to CreateObject() just gets a handle on the class itself. Calling init() on the resulting handle actually instantiates the class via the appropriate constructor.]
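
To make that concrete, here is a small illustration that uses java.lang.StringBuilder from the standard Java library instead of Neo4j. The variable names are made up for the example.

<cfscript>
   // Handle on the class only; no constructor has run yet
   builderClass = createObject("java", "java.lang.StringBuilder");

   // init() calls the matching constructor, here StringBuilder(String)
   builder = builderClass.init("Hello");
   builder.append(", world!");

   WriteOutput(builder.toString());
</cfscript>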

Just as in the Java example, it’s good to surround your connection with a try/catch block that will close your database connection if you throw an error. As I was working with Neo4j I would periodically lock up my database and not be able to connect without restarting CF. Adding a CFTRY/CFCATCH block cleared this right up.

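<cftry>
   <cfset tx = graphDb.beginTx() />

   <cfscript>
     tx.success();
     WriteOutput("Success.");
   </cfscript>

   <cfset tx.finish() />

   <cfcatch type="any">
      <!--- Finish any transaction that was started so it is not left open --->
      <cfif isDefined("tx")>
         <cfset tx.finish() />
      </cfif>
      <cfdump var="#cfcatch#">
   </cfcatch>
</cftry>

<!--- Runs whether or not an error was caught, so the database is always closed --->
<cfset graphDb.shutdown() />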

Where things got really sticky was the use of Java enumerations to declare the available relationship types for the graph:

 /* Java code */
 public enum MyRelationshipTypes implements RelationshipType
 {
    KNOWS
 }

To my knowledge there’s no way to declare something like this in standard CFML. I likely could have wrapped this in a Java class of some sort and loaded it through CreateObject(), but that wouldn’t have been true to the spirit of ColdFusion. So I dug around in the Neo4j docs and found an answer: relationship types can be created dynamically at runtime using the static withName() method on the class org.neo4j.graphdb.DynamicRelationshipType. I created an instance of DynamicRelationshipType for the “KNOWS” relationship and stored it in a struct, anticipating that I would cache these types in Application scope in a real application.

 relationship = CreateObject("java",
                             "org.neo4j.graphdb.DynamicRelationshipType");
 MyRelationshipTypes = structNew();
 MyRelationshipTypes.KNOWS = relationship.withName( "KNOWS" );

It might be interesting to see if these relationship enumerations could be generated and compiled by something like JavaLoader. I’m not yet aware of any downsides with dynamic relationships besides the obvious lack of compile-time checking.
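
The Application-scope caching mentioned earlier might look something like the following. This is a rough sketch rather than tested code: it assumes the application framework is enabled (an Application.cfc or Application.cfm that names the application), and application.relationshipTypes is a key invented for this example. The second check inside the lock is one common way to avoid building the struct twice when two requests arrive at once.

<cfif not structKeyExists(application, "relationshipTypes")>
   <cflock scope="application" type="exclusive" timeout="10">
      <!--- Check again: another request may have filled the cache while we waited --->
      <cfif not structKeyExists(application, "relationshipTypes")>
         <cfset typeFactory = createObject("java", "org.neo4j.graphdb.DynamicRelationshipType") />
         <cfset types = structNew() />
         <cfset types.KNOWS = typeFactory.withName("KNOWS") />
         <cfset application.relationshipTypes = types />
      </cfif>
   </cflock>
</cfif>

<!--- Later code can keep using the same name as in the example above --->
<cfset MyRelationshipTypes = application.relationshipTypes />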

The rest of the exercise follows without any real surprises:

 firstNode = graphDb.createNode();
 secondNode = graphDb.createNode();
 relationship = firstNode.createRelationshipTo( secondNode,
                                         MyRelationshipTypes.KNOWS );

 firstNode.setProperty( "message", "Hello, " );
 secondNode.setProperty( "message", "world!" );
 relationship.setProperty( "message", "brave Neo4j " );

 WriteOutput( firstNode.getProperty( "message" ) );
 WriteOutput( relationship.getProperty( "message" ) );
 WriteOutput( secondNode.getProperty( "message" ) );

And there you have it! A quick and dirty Neo4j application built with CFML.
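
For reference, the fragments above can be assembled into a single template roughly like this. It is a sketch of how the pieces fit together, using only the calls already shown in this post; the relationshipType variable name is changed for clarity, and the error handler finishes the transaction if one was started, which is a small liberty taken for this example.

<cfset dbroot = "/tmp/neo4jtest/" />

<!--- Open (or create) the embedded database --->
<cfset graphDb = createObject("java", "org.neo4j.kernel.EmbeddedGraphDatabase") />
<cfset graphDb.init(dbroot & "var/graphdb") />

<!--- Build the relationship types at runtime instead of using a Java enum --->
<cfset relationshipType = createObject("java", "org.neo4j.graphdb.DynamicRelationshipType") />
<cfset MyRelationshipTypes = structNew() />
<cfset MyRelationshipTypes.KNOWS = relationshipType.withName("KNOWS") />

<cftry>
   <cfset tx = graphDb.beginTx() />

   <cfscript>
      firstNode = graphDb.createNode();
      secondNode = graphDb.createNode();
      relationship = firstNode.createRelationshipTo(secondNode, MyRelationshipTypes.KNOWS);

      firstNode.setProperty("message", "Hello, ");
      secondNode.setProperty("message", "world!");
      relationship.setProperty("message", "brave Neo4j ");

      WriteOutput(firstNode.getProperty("message"));
      WriteOutput(relationship.getProperty("message"));
      WriteOutput(secondNode.getProperty("message"));

      // Mark the transaction as successful only after all the work is done
      tx.success();
   </cfscript>

   <cfset tx.finish() />

   <cfcatch type="any">
      <!--- Finish any transaction that was started so it is not left open --->
      <cfif isDefined("tx")>
         <cfset tx.finish() />
      </cfif>
      <cfdump var="#cfcatch#">
   </cfcatch>
</cftry>

<cfset graphDb.shutdown() />

If everything is wired up correctly, the page should print the three messages joined together as "Hello, brave Neo4j world!".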

I’ve put a little work into developing a Neo4j helper class that hides some of these warts in a nice clean CFC. As soon as I can get eGit to behave I’ll post the files on GitHub.

Speaking tonight on the Semantic Web

The Semantic Web has been a strong interest of mine over the last two years. I came across RDF and OWL through a research project at IST back in 2008. Until then I’d somehow been completely oblivious to their existence, and these are Web standards, no less.

If you’ve never heard of the Semantic Web, here’s a quick intro video. I’ll wait here.

Everybody back? Okay! The concepts behind OWL seemed to solve a few thorny design issues I’d come across in a decade of building relational database-backed Web 1.0 apps, and do so in a really elegant way. Working with OWL fuses aspects of relational database modeling, information architecture, and object-oriented design into a new set of technologies and techniques.

As I started talking to members of the developer community at Penn State about the Semantic Web, I got a lot of blank stares and misunderstandings (“Isn’t that just XML?”). And yet, every graduate student in IST was exposed to ontologies and semantic modelling as a routine part of the curriculum. The research community had been working with ontologies for years. Clearly there was a large academic-practitioner gap here to be bridged.

So as I’ve done many times in the past with a new technology or concept, I started talking about the Semantic Web at user group meetings and conferences, and looking for ways to apply these technologies in low-risk venues.

Tonight is the latest in this series of speaking engagements, and possibly the most challenging thus far. I’ll be presenting my talk “An Argument For Semantics” at the Portland Java User Group. I’ve been really impressed by the quality of homegrown presenters at PJUG since I started attending. My talk will be very different from the usual PJUG talks (less code, more conceptual), but I’m hoping the technical experience in the room can generate a good discussion on how and when it makes sense to employ Semantic Web technologies in real-world applications.

Heading to CodeMash ’08

I’m really looking forward to CodeMash. The slate of speakers and topics looks fantastic; it’s really nice to look at a conference schedule and see a lot of topics that are totally new to me.

One thing I’m curious about is Scala. I’ve been working with a research group lately on a project using Intelligent Agents, and through that was introduced to the idea of Functional Programming. Somehow I missed seeing this in my undergraduate days, though I remember my peers complaining about Scheme in one of the senior-level computer science courses.

Some of the talks on Groovy and Grails seem interesting, too. The Ruby on Rails movement has certainly sparked some innovation in the Web development community, and I like seeing those ideas cross-ported into the technologies I have more of an affinity for, such as Java and ColdFusion. Having recently built a full-scale Java application, and found it somewhat painful, I suspect there may be something useful here.