Working With ConceptNet
Tue, Jul 27, 2010

MIT's ConceptNet (http://csc.media.mit.edu/conceptnet) offers a very large dataset of concepts and their relationships to other concepts across 27 different relationship types.  For example, the relationship pictured below states that the class "fish" has the relationship of "being located at" the class "water."  Each relationship has a score assigned to it based on how well other members of MIT's OpenMind initiative judge the relationship's veracity.  The pictured score of 151 is relatively high, indicating that this is a truthful relationship.

Using the score of each edge in the graph (the 151 above), we can sum the incoming scores for each node.  For example, if the relation "people drink water" is represented as a directed edge towards water with a score of 49, the water node would have a total incoming score of 200 (151 + 49).  The top 10 concepts by score-weighted in-degree are listed below:

  1. fun (1,385)
  2. water (1,010)
  3. house (923)
  4. store (896)
  5. city (837)
  6. kitchen (773)
  7. sleep (762)
  8. animal (751)
  9. eat (732)
  10. learn (720)
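
The summation behind this list can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the post: the `(source, target, score)` tuple layout and the function name `incoming_scores` are assumptions, and the two edges are just the "fish"/"water" and "people"/"water" examples from the text.

```python
from collections import Counter

def incoming_scores(edges):
    """Sum edge scores per target node (score-weighted in-degree).

    `edges` is an iterable of (source, target, score) tuples; this layout
    is an assumption for illustration, not ConceptNet's actual schema.
    """
    totals = Counter()
    for source, target, score in edges:
        totals[target] += score
    return totals

# The two example edges described in the text.
edges = [
    ("fish", "water", 151),
    ("people", "water", 49),
]

totals = incoming_scores(edges)
print(totals.most_common(10))  # [('water', 200)]
```

Running the same function over the full edge list and taking `most_common(10)` would give a ranking of the kind shown above.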

Graphing the distribution of score in-degree, we see the expected long-tail effect.

The long tail appears to begin once a node's incoming score drops below 100.  We should also consider the scores of the edges (relationships) themselves.  When we rank the relationship types by the sum of the scores assigned to their edges, we see the following ranking:

  1. Is A (81,444)
  2. At Location (65,080)
  3. Has Property (60,671)
  4. Used For (60,504)
  5. Capable Of (30,117)
  6. Has Prerequisite (24,761)
  7. Has Subevent (23,795)
  8. Conceptually Related To (19,968)
  9. Causes (17,797)
  10. Has A (17,753)
  11. Motivated By Goal (14,297)
  12. Receives Action (8,177)
  13. Desires (5,872)
  14. Part Of (5,465)
  15. Causes Desire (4,871)
  16. Located Near (4,813)
  17. Defined As (3,781)
  18. Has First Subevent (3,663)
  19. Has Last Subevent (2,528)
  20. Made Of (1,906)
  21. Similar Size (1,102)
  22. Created By (706)
  23. Symbol Of (155)
  24. Has Pain Intensity (78)
  25. Instance Of (65)
  26. Inherits From (50)
  27. Has Pain Character (43)
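
A ranking like this comes from grouping edges by relationship type and summing their scores. The sketch below assumes edges are `(source, relation, target, score)` tuples; that layout, the function name `relation_totals`, and every sample edge except the "fish"/"water" one are made up for illustration.

```python
from collections import defaultdict

def relation_totals(edges):
    """Return (relation, total score) pairs, highest total first."""
    totals = defaultdict(int)
    for source, relation, target, score in edges:
        totals[relation] += score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

sample_edges = [
    ("fish", "AtLocation", "water", 151),  # example from the post
    ("dog", "IsA", "animal", 10),          # made-up edge and score
    ("cat", "IsA", "animal", 5),           # made-up edge and score
]

for relation, total in relation_totals(sample_edges):
    print(relation, total)
```

On the toy data this prints `AtLocation 151` followed by `IsA 15`.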

Using the long-tail cutoff, we can visualize ConceptNet using only the top 540 "connector" nodes that fall outside the long tail, yielding the following highly interconnected graph:

However, if we restrict the graph to only the "Is A" relationship, it becomes much less connected:

 

The "Located Near" relationship yields a disconnected graph:

 

And the "Causes" relation:

While there is obviously a lot more to learn from ConceptNet than making these graphs, I hope that this offers a good introduction and motivation for anyone else interested in ConceptNet.
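
For anyone who wants to experiment with the relation-restricted graphs, here is a rough sketch of the filtering step. It assumes edges are `(source, relation, target, score)` tuples, treats the cutoff as inclusive (nodes with an incoming score of at least 100 are kept), and ignores edge direction when checking connectivity. The function names and all sample edges except "fish"/"water" are hypothetical.

```python
from collections import Counter

def connector_nodes(edges, cutoff=100):
    """Nodes whose summed incoming score is at least `cutoff` (assumed inclusive)."""
    totals = Counter()
    for source, relation, target, score in edges:
        totals[target] += score
    return {node for node, total in totals.items() if total >= cutoff}

def count_components(edges, keep_nodes, relation=None):
    """Count connected components among keep_nodes, ignoring edge direction.

    If `relation` is given, only edges of that relationship type are used.
    """
    neighbors = {node: set() for node in keep_nodes}
    for source, rel, target, score in edges:
        if relation is not None and rel != relation:
            continue
        if source in neighbors and target in neighbors:
            neighbors[source].add(target)
            neighbors[target].add(source)

    seen = set()
    components = 0
    for start in neighbors:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:  # depth-first walk over one component
            node = stack.pop()
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return components

sample_edges = [
    ("fish", "AtLocation", "water", 151),  # example from the post
    ("water", "IsA", "liquid", 120),       # made-up edge and score
    ("juice", "IsA", "liquid", 30),        # made-up edge and score
    ("boat", "AtLocation", "water", 40),   # made-up edge and score
    ("tail", "PartOf", "fish", 110),       # made-up edge and score
]

keep = connector_nodes(sample_edges, cutoff=100)
print(sorted(keep))                                    # ['fish', 'liquid', 'water']
print(count_components(sample_edges, keep))            # 1
print(count_components(sample_edges, keep, "IsA"))     # 2
print(count_components(sample_edges, keep, "LocatedNear"))  # 3
```

On this toy data the full graph is a single component, the "IsA" restriction splits off "fish", and a relation with no edges among the kept nodes leaves every node isolated, which mirrors the pattern described above on a tiny scale.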
