Alan Grosskurth

Automated C/C++ fact extraction with BFX

I extract facts from quite a few software systems using the BFX pipeline. To save time, I have written a script and a Makefile to automate the process. This page describes the steps I follow.

Starting out

The steps below perform an automated extraction of one software system using the BFX pipeline. I assume you have already installed the BFX pipeline and set the $QLDX environment variable to the directory where QLDX is installed.

  1. Start working in an empty directory, say ~/extract:
    mkdir ~/extract
    cd ~/extract
    
  2. Download the bfx-init script and the Makefile, and make the bfx-init script executable:
    wget http://www.cs.uwaterloo.ca/~agrossku/2005/automated-bfx/bfx-init
    wget http://www.cs.uwaterloo.ca/~agrossku/2005/automated-bfx/Makefile
    chmod 755 bfx-init
    
  3. Download the source for the package you want to extract, say ctags-5.5.4.tar.gz:
    wget http://www.cs.uwaterloo.ca/~agrossku/2005/automated-bfx/ctags-5.5.4.tar.gz
    
  4. Extract the package:
    tar xvzf ctags-5.5.4.tar.gz
  5. Build the software and generate a template containment structure:
    ./bfx-init ctags-5.5.4
  6. Build and view the landscape:
    make PKG=ctags-5.5.4

You should now be looking at the landscape in LSEdit:

[Screenshot: viewing the landscape in LSEdit]

You can perform automated extractions of as many software systems as you want in this same directory. If you would like to remove the intermediate build files, you can run:

make clean PKG=ctags-5.5.4
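
Because several systems can share one directory, steps 4 to 6 are easy to wrap in a small shell function. The following is only a sketch: bfx_extract and run are hypothetical names that are not part of BFX or of the downloaded files, and it assumes each tarball unpacks into a directory named after the tarball without its .tar.gz suffix, as ctags-5.5.4.tar.gz does. Setting DRY_RUN=1 prints the commands without running them.

```shell
# Hypothetical helpers, not part of BFX or of the downloaded files.

# run CMD...: print the command, then execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# bfx_extract TARBALL: unpack, build, and view one package (steps 4 to 6).
# Assumes bfx-init and the Makefile are already in the current directory
# and that TARBALL unpacks into a directory named TARBALL minus ".tar.gz".
bfx_extract() {
    tarball=$1
    pkg=$(basename "$tarball" .tar.gz)
    run tar xzf "$tarball" &&
    run ./bfx-init "$pkg" &&
    run make PKG="$pkg"
}
```

With these definitions loaded, `bfx_extract ctags-5.5.4.tar.gz` would run the same three commands as steps 4 to 6 above and stop at the first one that fails.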

Modifying the containment structure

If you click on the top-level subsystem shown above, you will be taken inside and should see something like this:

[Screenshot: viewing the structure of the landscape in LSEdit]

You will probably want to modify the containment hierarchy in order to make the system easier to comprehend. You have two options for this:

  1. Use LSEdit to drag-and-drop entities.
  2. Edit the contain file by hand.

If you choose the first option, you will be modifying ctags-5.5.4.ls.ta directly (although you can save the landscape under a different name). To apply the same containment structure to a different version of the program, however, you will have to use grep to extract the lines that begin with the word "contain" from this file and apply them to the new version. In those lines your subsystems have strange names like "Entity#5", although they keep the labels you gave them in LSEdit.
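
The grep step might look like the sketch below. The sample file is a made-up stand-in for a landscape saved from LSEdit: its non-contain lines are only illustrative, a real .ls.ta file has many more lines, and the file name saved.contain is my own choice.

```shell
# Made-up stand-in for a landscape saved from LSEdit. Only the "contain"
# lines matter here; the other lines are illustrative filler.
work=$(mktemp -d)
cat > "$work/ctags-5.5.4.ls.ta" <<'EOF'
$INSTANCE Entity#5 cSubSystem
contain ctags-5.5.4 Entity#5
contain Entity#5 ctags-5.5.4/main.o
cLinks ctags-5.5.4/main.o ctags-5.5.4/parse.o
EOF

# Keep only the lines that begin with the word "contain".
grep '^contain[[:space:]]' "$work/ctags-5.5.4.ls.ta" > "$work/saved.contain"
cat "$work/saved.contain"
```

The extracted lines still refer to subsystems by generated names such as Entity#5, which is part of what makes this route awkward.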

Because of these complications, I prefer the second option. To see it in action, do the following:

  1. Open up the ctags-5.5.4.contain file in a text editor. It should look like this:
    contain ctags-5.5.4 ctags-5.5.4/args.o
    contain ctags-5.5.4 ctags-5.5.4/asm.o
    contain ctags-5.5.4 ctags-5.5.4/asp.o
    contain ctags-5.5.4 ctags-5.5.4/awk.o
    contain ctags-5.5.4 ctags-5.5.4/beta.o
    contain ctags-5.5.4 ctags-5.5.4/c.o
    contain ctags-5.5.4 ctags-5.5.4/cobol.o
    contain ctags-5.5.4 ctags-5.5.4/eiffel.o
    contain ctags-5.5.4 ctags-5.5.4/entry.o
    contain ctags-5.5.4 ctags-5.5.4/erlang.o
    contain ctags-5.5.4 ctags-5.5.4/fortran.o
    contain ctags-5.5.4 ctags-5.5.4/get.o
    contain ctags-5.5.4 ctags-5.5.4/html.o
    contain ctags-5.5.4 ctags-5.5.4/jscript.o
    contain ctags-5.5.4 ctags-5.5.4/keyword.o
    contain ctags-5.5.4 ctags-5.5.4/lisp.o
    contain ctags-5.5.4 ctags-5.5.4/lregex.o
    contain ctags-5.5.4 ctags-5.5.4/lua.o
    contain ctags-5.5.4 ctags-5.5.4/main.o
    contain ctags-5.5.4 ctags-5.5.4/make.o
    contain ctags-5.5.4 ctags-5.5.4/options.o
    contain ctags-5.5.4 ctags-5.5.4/parse.o
    contain ctags-5.5.4 ctags-5.5.4/pascal.o
    contain ctags-5.5.4 ctags-5.5.4/perl.o
    contain ctags-5.5.4 ctags-5.5.4/php.o
    contain ctags-5.5.4 ctags-5.5.4/python.o
    contain ctags-5.5.4 ctags-5.5.4/read.o
    contain ctags-5.5.4 ctags-5.5.4/rexx.o
    contain ctags-5.5.4 ctags-5.5.4/routines.o
    contain ctags-5.5.4 ctags-5.5.4/ruby.o
    contain ctags-5.5.4 ctags-5.5.4/scheme.o
    contain ctags-5.5.4 ctags-5.5.4/sh.o
    contain ctags-5.5.4 ctags-5.5.4/slang.o
    contain ctags-5.5.4 ctags-5.5.4/sml.o
    contain ctags-5.5.4 ctags-5.5.4/sort.o
    contain ctags-5.5.4 ctags-5.5.4/sql.o
    contain ctags-5.5.4 ctags-5.5.4/strlist.o
    contain ctags-5.5.4 ctags-5.5.4/tcl.o
    contain ctags-5.5.4 ctags-5.5.4/verilog.o
    contain ctags-5.5.4 ctags-5.5.4/vim.o
    contain ctags-5.5.4 ctags-5.5.4/yacc.o
    contain ctags-5.5.4 ctags-5.5.4/vstring.o
    contain ctags-5.5.4 ctags-5.5.4/readtags.o
    
  2. Modify it so it looks like this:
    contain ctags-5.5.4 front-end
    contain ctags-5.5.4 core
    contain ctags-5.5.4 tags
    contain ctags-5.5.4 util
    contain ctags-5.5.4 lang
    contain ctags-5.5.4 unused
    contain front-end ctags-5.5.4/options.o
    contain front-end ctags-5.5.4/args.o
    contain core ctags-5.5.4/lregex.o
    contain core ctags-5.5.4/main.o
    contain core ctags-5.5.4/parse.o
    contain core ctags-5.5.4/read.o
    contain tags ctags-5.5.4/entry.o
    contain tags ctags-5.5.4/sort.o
    contain util ctags-5.5.4/keyword.o
    contain util ctags-5.5.4/routines.o
    contain util ctags-5.5.4/strlist.o
    contain util ctags-5.5.4/vstring.o
    contain lang regex-based
    contain lang non-regex-based
    contain unused ctags-5.5.4/readtags.o
    contain non-regex-based keyword-hash
    contain non-regex-based no-keyword-hash
    contain keyword-hash c
    contain c ctags-5.5.4/get.o
    contain c ctags-5.5.4/c.o
    contain regex-based ctags-5.5.4/cobol.o
    contain regex-based ctags-5.5.4/html.o
    contain regex-based ctags-5.5.4/jscript.o
    contain regex-based ctags-5.5.4/rexx.o
    contain regex-based ctags-5.5.4/slang.o
    contain regex-based ctags-5.5.4/yacc.o
    contain keyword-hash ctags-5.5.4/asm.o
    contain keyword-hash ctags-5.5.4/beta.o
    contain keyword-hash ctags-5.5.4/eiffel.o
    contain keyword-hash ctags-5.5.4/erlang.o
    contain keyword-hash ctags-5.5.4/fortran.o
    contain keyword-hash ctags-5.5.4/pascal.o
    contain keyword-hash ctags-5.5.4/perl.o
    contain keyword-hash ctags-5.5.4/python.o
    contain keyword-hash ctags-5.5.4/sh.o
    contain keyword-hash ctags-5.5.4/sml.o
    contain keyword-hash ctags-5.5.4/sql.o
    contain keyword-hash ctags-5.5.4/verilog.o
    contain no-keyword-hash ctags-5.5.4/asp.o
    contain no-keyword-hash ctags-5.5.4/awk.o
    contain no-keyword-hash ctags-5.5.4/lisp.o
    contain no-keyword-hash ctags-5.5.4/lua.o
    contain no-keyword-hash ctags-5.5.4/make.o
    contain no-keyword-hash ctags-5.5.4/php.o
    contain no-keyword-hash ctags-5.5.4/ruby.o
    contain no-keyword-hash ctags-5.5.4/scheme.o
    contain no-keyword-hash ctags-5.5.4/tcl.o
    contain no-keyword-hash ctags-5.5.4/vim.o
    
    Make sure that each file, and each subsystem other than the top-level one, belongs to exactly one subsystem.
  3. If LSEdit is open, close it. Rebuild and view the landscape:
    make PKG=ctags-5.5.4
  4. Click on the top-level subsystem, and you should now see something like this:

    [Screenshot: viewing the structure of the landscape in LSEdit]

When I am exploring a system, I usually keep the contain file open in an editor and tweak the containment hierarchy as necessary, then rerun the make command to update the landscape. Reopening LSEdit each time is a bit of a pain, but I find this more productive than editing the containment directly in LSEdit.
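
When editing the contain file by hand, it is easy to list a file under two subsystems or to leave a subsystem without a parent. A quick check before rerunning make can catch both. The function below is a sketch under my own assumptions, not part of BFX: check_contain is a hypothetical name, and it assumes every line of interest has the form "contain PARENT CHILD" with no spaces inside names.

```shell
# check_contain FILE: report entities with more than one parent and verify
# that exactly one entity (the top-level subsystem) has no parent.
# Exits with status 1 if a problem is found, 0 otherwise.
check_contain() {
    awk '
        $1 == "contain" { nparents[$3]++; isparent[$2] = 1 }
        END {
            for (e in nparents)
                if (nparents[e] > 1) { print "multiple parents: " e; bad = 1 }
            for (p in isparent)
                if (!(p in nparents)) roots++
            if (roots != 1) { print "expected 1 root, found " (roots + 0); bad = 1 }
            exit (bad + 0)
        }
    ' "$1"
}

# Demonstration on a deliberately broken file: main.o sits in two subsystems.
demo=$(mktemp)
cat > "$demo" <<'EOF'
contain ctags-5.5.4 core
contain ctags-5.5.4 util
contain core ctags-5.5.4/main.o
contain util ctags-5.5.4/main.o
EOF
check_contain "$demo" || echo "fix the contain file before running make"
rm -f "$demo"
```

Something like `check_contain ctags-5.5.4.contain && make PKG=ctags-5.5.4` would then rebuild the landscape only when the check passes.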