Peace, Caffeine, Linux

Software, Biology, and Technology in the new millennium.

Secure Java Programming

I gave a presentation titled "Secure Java Programming" on November 14, 2007 to the most excellent Philadelphia Area Java Users' Group at the Unisys East Coast Development Center in Malvern, PA. This is a great JUG - every time I have gone to one of their meetings the turnout has been huge and the speakers first-class. If you are interested in Java and work in the Greater Philadelphia area, it's definitely worth making an effort to attend.

During my talk I reviewed:

  • Java platform security features
  • Online resources related to security
  • Common vulnerabilities exploited by attackers
  • A 10-part "Leet Skillz" course that reviewed vulnerable code examples and how to mitigate the issues

This 6.5 meg PowerPoint is the original presentation plus some additional material on bypassing access modifiers using Reflection - enjoy!

Download java_programming_security.pps
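For readers who have not seen the Reflection trick mentioned above, here is a minimal sketch of the general idea. It is not taken from the slides: the Vault class and its secret field are made up for illustration, and the only real API calls are getDeclaredField, setAccessible, and get from java.lang.reflect.

```java
import java.lang.reflect.Field;

class ReflectionBypassDemo {
    /** Reads Vault's private field by switching off the access check. */
    static String readSecret(Vault vault) {
        try {
            Field field = Vault.class.getDeclaredField("secret");
            // A SecurityManager (or module rules on newer JVMs) can refuse this call.
            field.setAccessible(true);
            return (String) field.get(vault);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflection failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readSecret(new Vault()));
    }
}

// Hypothetical class, present only so there is something private to read.
class Vault {
    private String secret = "s3cr3t";
}
```

The point of the sketch is that "private" is a compile-time courtesy, not a security boundary, unless the runtime is configured to deny setAccessible.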

January 29, 2008 in Development | Permalink | Comments (2) | TrackBack (0)

Regular Expression for Type 4 JDBC URL with Embedded IP Address or Hostname

We eat a lot of JDBC Type 4 database driver connections at Portico Systems. One of the challenges during installations is ensuring the deployment team gets the syntax of these URLs correct. We have some decent validation of them embedded already, but recently I decided to make it bulletproof.

Since Java 1.4, regular expressions have been baked into the Java stack, for example in the Pattern class. This is a beautiful thing. (Before 1.4, libraries like Jakarta ORO were used to get the power of regex in our code.)

So I was thinking that surely out there on the inker-net somebody would have posted clever regexes to handle JDBC URL's as well as IP quad format addresses. What I discovered were some bits and pieces - some wrong, some close, and none really complete.

So here is my contribution; I hope it is useful to you. Rather than walk you through this beast I am just going to put it out there. If you dance with this pattern, I recommend the wonderful Regex Coach, a Common Lisp based (!) native Windows app. I wish I had something like it on OS X and Linux.

Back to my pain...  the biggest challenge to wrestle was that an Oracle Type 4 URL could have an IP address, or a hostname embedded in it.  Both of these are very tricky to handle in a regex. The pattern I put together comes really close to perfection, in fact it may just be there already but I don't want to jinx this blog. I am guessing I may have missed something, but so far all my JUnit tests are passing.

What's the big whoop about IP addresses in "dotted quad"? They are tricky because of the limited range of each number - 0 to 255 for each quad. And if you really wanted to get technical, there are certain reserved ranges you might want to lock out, but I just don't have that kind of time. Oh, and I am not worrying about IPv6 with this regex pattern.
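To make the range rule concrete, here is a minimal sketch that checks a dotted quad without any regex, assuming each number has to fit in one byte (0 through 255). The class and method names are hypothetical, and like the regex it makes no attempt to lock out reserved ranges or handle IPv6.

```java
class DottedQuad {
    /** True when the text is four decimal numbers in 0..255 separated by dots. */
    static boolean isValid(String text) {
        String[] parts = text.split("\\.", -1); // -1 keeps trailing empty pieces
        if (parts.length != 4) {
            return false;
        }
        for (String part : parts) {
            if (part.isEmpty() || part.length() > 3) {
                return false;
            }
            for (int i = 0; i < part.length(); i++) {
                char c = part.charAt(i);
                if (c < '0' || c > '9') {
                    return false;
                }
            }
            if (Integer.parseInt(part) > 255) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid("216.3.66.175")); // true
        System.out.println(isValid("256.1.1.1"));    // false
    }
}
```

A check like this is handy as a cross-check in unit tests: anything the regex accepts as an IP address should also pass here.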

And hostnames, aren't they easy? NOPE. How many people out there really know the spec? I was forced long ago, when operating my own DNS servers, to appreciate RFC 952 and RFC 882. They differ on some points, so I am not entirely clear on how it all shakes out... for example RFC 952 disallows single character names and RFC 882 allows them. But they agree on most of the high level basics.

From RFC 952:

  1. A "name" (Net, Host, Gateway, or Domain name) is a text string up
   to 24 characters drawn from the alphabet (A-Z), digits (0-9), minus
   sign (-), and period (.).  Note that periods are only allowed when
   they serve to delimit components of "domain style names". (See
   RFC-921, "Domain Name System Implementation Schedule", for
   background).  No blank or space characters are permitted as part of a
   name. No distinction is made between upper and lower case.  The first
   character must be an alpha character.  The last character must not be
   a minus sign or period. (...) Single character names
   or nicknames are not allowed.

Hmm... I have to wonder if the 24 position thing is really a limit anymore, and aren't there root name servers on the internet with single position names? From RFC 882:

   Appendix 1 - Domain Name Syntax Specification

   The preferred syntax of domain names is given by the following BNF
   rules.  Adherence to this syntax will result in fewer problems with
   many applications that use domain names (e.g., mail, TELNET).  Note
   that some applications described in [14] use domain names containing
   binary information and hence do not follow this syntax.

      <domain> ::=  <subdomain> | " "

      <subdomain> ::=  <label> | <subdomain> "." <label>

      <label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]

      <ldh-str> ::= <let-dig-hyp> | <let-dig-hyp> <ldh-str>

      <let-dig-hyp> ::= <let-dig> | "-"

      <let-dig> ::= <letter> | <digit>

      <letter> ::= any one of the 52 alphabetic characters A through Z
      in upper case and a through z in lower case

      <digit> ::= any one of the ten digits 0 through 9

   Note that while upper and lower case letters are allowed in domain
   names no significance is attached to the case.  That is, two names
   with the same spelling but different case are to be treated as if
   identical.

   The labels must follow the rules for ARPANET host names.  They must
   start with a letter, end with a letter or digit, and have as interior
   characters only letters, digits, and hyphen.  There are also some
   restrictions on the length.  Labels must be 63 characters or less.

I have attempted to faithfully adhere to both these specs where they agree, and where they do not - I allow for 63 position as well as single position names.
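The label rule quoted above (start with a letter, end with a letter or digit, only letters, digits, and hyphens in between, at most 63 characters, single characters allowed) can be sketched in plain Java as follows. This is an illustration of the RFC 882 wording, not the check my regex performs, and the class name is made up.

```java
class HostLabel {
    /** Checks one dot-free label against the RFC 882 label rule. */
    static boolean isValid(String label) {
        int n = label.length();
        if (n < 1 || n > 63) {
            return false;
        }
        if (!isLetter(label.charAt(0))) {
            return false;
        }
        char last = label.charAt(n - 1);
        if (!isLetter(last) && !isDigit(last)) {
            return false;
        }
        // Interior characters: letters, digits, or hyphen.
        for (int i = 1; i < n - 1; i++) {
            char c = label.charAt(i);
            if (!isLetter(c) && !isDigit(c) && c != '-') {
                return false;
            }
        }
        return true;
    }

    private static boolean isLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        System.out.println(isValid("porticosys")); // true
        System.out.println(isValid("a-"));         // false
    }
}
```

Note that this is stricter than the \w shorthand, which also admits the underscore.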

Finally, note that my regex creates 3 matching groups:

  1. hostname or IP address
  2. port
  3. Oracle SID

These groupings are your handles to programmatically pluck out the URL pieces you want to chew on.

Let me know if I missed something! (NOTE -  my blog was having trouble with the long string.  I have chopped it into five bits.  You must put the following five lines into one long string to create the regex, or click here to get it in a downloaded text file.)


^jdbc:oracle:thin:@((?:(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}

(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))

|(?:(?:(?:(?:[A-Za-z])(?:[\w-]){0,61}(?:[\w]?[.]))*)

(?:(?:[A-Za-z])(?:[\w-]){0,61}(?:[\w]?))))

:([0-9]{1,5}):([A-Za-z_#$]{1,8})$

Once you have the previous five lines stitched together into one regex, you should be able to match the following example URLs, and break them down into the three groupings:

  • jdbc:oracle:thin:@www.porticosys.com:12345:orcl
  • jdbc:oracle:thin:@216.3.66.175:1:orclABCD
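Here is a minimal sketch of plucking the three groups out with java.util.regex. It assumes a variant of the pattern in which the dot between the IP quads is escaped and the character classes carry no literal pipe characters; the JdbcUrlParser class and its parse method are hypothetical names for this example only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JdbcUrlParser {
    // The five pieces stitched into one pattern (dot escaped, no '|' inside classes).
    static final Pattern JDBC_URL = Pattern.compile(
          "^jdbc:oracle:thin:@((?:(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}"
        + "(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))"
        + "|(?:(?:(?:(?:[A-Za-z])(?:[\\w-]){0,61}(?:[\\w]?[.]))*)"
        + "(?:(?:[A-Za-z])(?:[\\w-]){0,61}(?:[\\w]?))))"
        + ":([0-9]{1,5}):([A-Za-z_#$]{1,8})$");

    /** Returns {host or IP, port, SID}, or null when the URL does not match. */
    static String[] parse(String url) {
        Matcher m = JDBC_URL.matcher(url);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] urls = {
            "jdbc:oracle:thin:@www.porticosys.com:12345:orcl",
            "jdbc:oracle:thin:@216.3.66.175:1:orclABCD",
            "jdbc:oracle:thin:@216x3x66x175:1:orcl" // rejected because the dot is escaped
        };
        for (String url : urls) {
            String[] parts = parse(url);
            System.out.println(parts == null
                ? "no match: " + url
                : "host=" + parts[0] + " port=" + parts[1] + " sid=" + parts[2]);
        }
    }
}
```

Compiling the pattern once into a static field avoids paying the compile cost on every validation.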

May 01, 2006 in Development | Permalink | Comments (1)

Oracle SQL Developer Part II - Annoyances

I have had a chance to use Oracle SQL Developer rather heavily in the last two weeks, on Windows, OS X, and Linux running under Java 5. All in all I am still very positive about it. It is definitely usable for "professional" work, and I keep discovering more "nice to haves" in the UI that make it a very workable tool for most SQL slingers.  It is clear that Oracle really put a lot of effort into building this tool.

When I wrote my original post about SQL Developer, I had already identified some quirks that bugged me, the biggest being the way the context sensitive pop-ups don't seem to fire on "aliases" in SQL. I now have some other new annoyances identified, some of which may be my own ignorance about the software, but some I think are real beefs that I hope they will address in a future build:

  • font anti-aliasing not used on all parts of the UI
  • code completion always upper case
  • "Results" view should allow sorting by clicking on columns
  • "Format SQL" processes the whole buffer, not just what is selected
  • some weirdness between platforms and navigation (page down on Linux seemed to do strange things for me sometimes)
  • no easy way to jump to end of data - scroll bar behavior is a little strange, it seems like it opportunistically fetches additional rows as you scroll down
  • When listing objects it only shows you a limited length list.  To "Show all" you have to do way too many clicks:
    1. Double-click on Show More
    2. Click on "Show All" radio button
    3. Click on "OK"
  • find feature should work like Firefox's, not a dialog that always pops up in the middle of the window
  • bug in trigger view shows triggers more than once
  • too easy to accidentally close a tab - there is this cool  little "X" that pops up, but I find myself accidentally hitting it frequently
  • can't visually tell if triggers are disabled
  • can't copy and paste from an arbitrary selection on a grid such as the list of dependent objects using Mac style copy - you have to use a [Ctrl-C] instead
  • SQL should be easily available for all views (e.g. the dependencies view - how did the tool get that data?)
  • Can't tell what tabs have active sessions
  • When you click on objects in the "navigator", it is sometimes non-intuitive what is going to happen (as in whether a new tab is going to open, or whether it is going to replace the focus of the current tab)

Sounds like a lot of complaints, but all-in-all I love the tool and will definitely keep using it heavily.

April 24, 2006 in Development | Permalink | Comments (0)

Oracle SQL Developer

At Portico Systems we do a lot of work with Oracle 8i, 9i, and 10g.  I have been working with Oracle RDBMS's since personally plunking down $3000 of my own hard earned cash way back around 1992 to get Oracle 6 RDBMS for the Mac, join their developer program, and get Oracle Card (ouch!) and other tools and utilities.  I am not kidding - this was the full blown database, running on the Mac OS back in the early 90's!

It had always been a fact of life that Oracle GUI client applications were (how to put this nicely?) rather lame compared to what Microsoft started to put out there for SQL Server.  In the 90's many of us found ourselves explaining (over and over) to the Microsofties that SQLPlus was "lame for a reason." It was a "least common denominator" application, because the Oracle stack was truly cross platform.

Then there were the early GUI applications from Oracle, which were some of the first to leverage Java.  Oracle had their own cool look and feel that had a curvy sculpted JFC/IFC (precursors to Swing) look to them.  But they really weren't showcase Java GUI applications of their time.  (remember SQL Worksheet?)

In the end many Oracle developers use third party tools to access and interact with the database. Years ago we discovered PL/SQL Developer which I felt had 90% of TOAD's functionality for much less cost.  So Portico has standardized on PL/SQL Developer ever since.  It really is a fantastic tool.  (Many of the other developers at Portico still use PLEdit and Golden from Benthic Software.)

My only beef with PL/SQL Developer is that being a Windows native app, I can't use it on my Macs or Linux based desktops and servers.  For those machines Aqua Data Studio, which rides on top of the JDBC layer and is compatible with many back-ends, is a contender.  I have used it with great success on OS X, Linux, and Windows against SQL Server, Oracle, and MySQL.

So why the post today? There is big news from Oracle this month!  They have come out with an all Java based tool called SQL Developer.


I have only installed it on my Windows laptop so far, and will have it up and running on other OS'es soon.  It looks really great. After a little messing with it today I found myself wondering...  what took them so long!!!

The tool does not match PL/SQL Developer's functionality, but it is a great second tier utility to throw on machines where you don't want or need to pay for something.  I don't know it well enough to really give a thorough review, but here are a couple points as I see it so far:

Pros

  • Snappy Swing based UI running under Java 5
  • Nice integration of a "reporting" feature with lots of sample reports
  • A cool "Snippets" tab with pasteable, uh, "snippets"
  • 100% Java!!!

Cons

  • Context sensitive pop-ups don't seem to fire on "aliases" in SQL (bummer!)
  • XML export writes element text out as CDATA (why???)
  • Export to outside tools (e.g. Excel) is a little too difficult

All in all, I am really excited about SQL Developer.  It's definitely worth checking out.  Back-end it with the even more amazing Oracle Database 10g Express Edition (also free!), and a developer can have a full Oracle 10g stack with a very functional graphical SQL tool.  Life is good!

March 22, 2006 in Development | Permalink | Comments (0)

It's official - Swing Rocks and Rulz

We have known it for years, and now some would say there is proof:

Official: Swing is the Dominant GUI Toolkit

As with all IT surveys, enjoy with some salt... and let me give you my own take on Swing.

My company has been using Swing in production application use since 1999. For those of you who don't know, Swing is the cross platform GUI framework that has been part of the core Java platform releases since 1.2.  It was available before Java 1.2 as a separate download - in fact, in the old days, we used to deploy the swing jar separately with our applications.

FMG's first GUI Java production applications were written using Java AWT.  Been there, done that, don't wanna go back!  Swing was a breath of fresh air, and we jumped on the opportunity to port our code from AWT to Swing.

I am frequently asked how Swing has treated us.  It has been great.  We have had ZAROO Swing related bugs since first rolling it out.  The Swing based portions of our applications are pounded on daily by hundreds of end users.  Millions of transactions have been put through these apps.  And no problems.  To top it off, I can run our apps on OS X, Linux, and Windows of all flavors.

We kept hearing in the IT press that "Java was dead on the desktop", that Swing had issues, and worse.  Meanwhile, FMG was plugging away with one Swing based application after another.  I never understood why popular opinion checked out hard on desktop Java for so many years. 

HTML/HTTP is not adequate for all business applications - sometimes a user really needs a thick interface.  (AJAX will have to be a topic some other day.)  I think once Java became the de facto language of ecommerce - which was driven almost entirely from the server side - Java on the desktop became an afterthought for many.  But Swing was always out there, and it was a solid way to get a business user a great thick GUI experience.

I read (and reread) .NET Framework Essentials when it first came out.  I was amused (but not shocked) to see that Microsoft had in large part copied many of the best aspects of Java, and at many layers of the stack, from architecture all the way down to the syntax and grammar of the core language (C# to be specific).  What really blew my mind was that they clearly based much of the design of Windows Forms on the Swing API's.  If you knew Swing, Windows Forms was not going to be hard to learn.

Don't underestimate the import of Swing being one of the dominant GUI class libraries in use today.  I think this is another sign that Java on the client is back on everyone's radar... and that rocks!  Hats off to Sun and the Swing team for their much deserved success!

October 19, 2005 in Development | Permalink | Comments (0)

Nano Nano Time

It's time to talk about Nano Time.  Not Nanoo Nanoo people.  NANO NANO.


I have had a lot of trouble finding info on the new JDK 1.5 System.nanoTime(). I read that it is not synchronized with UTC, which is not an issue if you are only measuring elapsed time, as in profiling.

From Wikipedia:

A nanosecond (ns) is equal to 10^-9 of a second.

  • It is only infrequently put into everyday use. In technical situations it is however a very common unit, especially in computers, telecommunications, pulsed lasers, and some areas of electronics.
  • In 1 ns, light travels exactly 299.792458 mm in a vacuum (via the definition of the metre). But the speed of light is slower in materials, indicated by an index of refraction n greater than 1. Thus in air (n = 1.003), light travels about 298.9 mm in 1 ns, but it travels only about 225.4 mm in water (n = 1.33) each nanosecond.

Wow!  That is short.  So a nanosecond is 1 millionth of a millisecond.  Multiply it by 1000 and you are operating with microseconds, which is usually a little more useful.  A microsecond is 1 millionth of a second.
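As a quick illustration of those unit relationships, here is a minimal sketch that times a loop with System.nanoTime() and converts the result with java.util.concurrent.TimeUnit, which also arrived in Java 5. The NanoUnits class and its helper names are made up for this example, and the loop is just filler work.

```java
import java.util.concurrent.TimeUnit;

class NanoUnits {
    /** Whole microseconds in the given nanoseconds (fraction is truncated). */
    static long nanosToMicros(long nanos) {
        return TimeUnit.NANOSECONDS.toMicros(nanos);
    }

    /** Whole milliseconds in the given nanoseconds (fraction is truncated). */
    static long nanosToMillis(long nanos) {
        return TimeUnit.NANOSECONDS.toMillis(nanos);
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long sum = 0;
        for (int i = 0; i < 1000000; i++) {
            sum += i; // filler work to have something to time
        }
        long elapsed = System.nanoTime() - start;
        System.out.println("sum = " + sum);
        System.out.println("elapsed ns = " + elapsed);
        System.out.println("elapsed us = " + nanosToMicros(elapsed));
        System.out.println("elapsed ms = " + nanosToMillis(elapsed));
    }
}
```

Only the difference between two nanoTime() readings is meaningful; a single reading is not a wall-clock time.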

But how is this implemented on Linux?

First a little java program:

 

public class NanoTest
{
    public static void main( String[] p_args )
    {
        System.out.println( "TRACE 1 - nanos = " + System.nanoTime() );
        System.out.println( "TRACE 2 - nanos = " + System.nanoTime() );
    }
}

Next up, compile it:

sfraser@tennessee:~/Source/NanoTest$ java -version
java version "1.5.0_04"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_04-b05)
Java HotSpot(TM) Client VM (build 1.5.0_04-b05, mixed mode, sharing)
sfraser@tennessee:~/Source/NanoTest$ javac NanoTest.java

Run it under the control of strace so that we can log how it interacts with the linux kernel:

sfraser@tennessee:~/Source/NanoTest$ strace -ff -o NanoTest.log java NanoTest
TRACE 1 - nanos = 1126744308394574000
TRACE 2 - nanos = 1126744308395374000
sfraser@tennessee:~/Source/NanoTest$

After some comparisons of the strace output with and without the calls to System.nanoTime(), the main difference I noticed (not easy - there are MANY calls to gettimeofday in strace output!) was there appeared to be some extra calls to gettimeofday() in my code that used System.nanoTime():

gettimeofday({1126744308, 394574}, NULL) = 0
write(1, "TRACE 1 - nanos = 11267443083945"..., 37) = 37
write(1, "\n", 1)                       = 1
gettimeofday({1126744308, 395374}, NULL) = 0
write(1, "TRACE 2 - nanos = 11267443083953"..., 37) = 37
write(1, "\n", 1)

So it would appear that Sun's JDK on Linux calls gettimeofday to get the, uh, time of day. "gettimeofday" is what is known as a "syscall" in Linux - a part of the API that the Linux kernel presents to your application.  In Linux the "syscalls" are usually implemented as a function called "sys_*", which frequently calls another function called "do_*" that actually does the real work.

So let's do some digging - this is Linux, let's look at the source!  First of all, make sure you have the kernel sources actually installed.  I am on Kubuntu linux, which uses APT as its package management technology (you may be familiar with RPM on Redhat), so installing the sources is as easy as this (really):

sudo apt-get install linux-source-2.6.10

 

Next up, let's search the source and figure out where our gettimeofday is implemented:

 

find /usr/src/linux-source-2.6.10/ -name '*.c' -exec grep "gettimeofday" '{}' \; -ls > ~/scratch/GetTimeOfDay.txt

Now...  search through the text file and figure out where gettimeofday is defined, used, etc...  WAIT A MINUTE.  There has to be an easier way, right?  OK OK, I know there are easier ways...  instead of the brutality above, how about you just check out LXR, it's SWEET.  For instance, click here for access to the 2.6.10 source code.

After a little digging, you would stumble across this interesting source file:

/*
 *  linux/kernel/time.c
 *
 *  Copyright (C) 1991, 1992  Linus Torvalds
 *
 *  This file contains the interface functions for the various
 *  time related system calls: time, stime, gettimeofday, settimeofday,
 *                             adjtime
 */

A little searching in it will find the syscall here in linux/kernel/time.c, which as expected calls a "do_gettimeofday":

asmlinkage long sys_gettimeofday(struct timeval __user *tv, struct timezone __user *tz)
{
        if (likely(tv != NULL)) {
                struct timeval ktv;
                do_gettimeofday(&ktv);
                if (copy_to_user(tv, &ktv, sizeof(ktv)))
                        return -EFAULT;
        }
        if (unlikely(tz != NULL)) {
                if (copy_to_user(tz, &sys_tz, sizeof(sys_tz)))
                        return -EFAULT;
        }
        return 0;
}

Finally, the smoking gun, do_gettimeofday, is implemented for the Intel architecture in linux/arch/i386/kernel/time.c:

/*
 * This version of gettimeofday has microsecond resolution
 * and better than microsecond precision on fast x86 machines with TSC.
 */
void do_gettimeofday(struct timeval *tv)

Note the comment!!!  So do_gettimeofday basically fills in a "timeval" struct which is defined in time.h:

struct timeval {
        time_t          tv_sec;         /* seconds */
        suseconds_t     tv_usec;        /* microseconds */
};

A ha!!!  Microseconds!  However if you dig further you will find accommodations in the code made to handle time in nanoseconds.  In fact, the struct being copied from is meant to handle nanosecond resolution:

struct timespec {
        time_t  tv_sec;         /* seconds */
        long    tv_nsec;        /* nanoseconds */
};

So if do_gettimeofday is starting with a struct that records time to the nanosecond, where does it scale this down to microseconds?  Here, with an integer division that simply drops the sub-microsecond part:

                usec += (xtime.tv_nsec / 1000);

So it is all starting to come together.  If you go back and look really closely at the strace output, you can see the two parameters coming back from the gettimeofday call being combined and multiplied by 1000 to produce the output our program generated.  In other words this call:

gettimeofday({1126744308, 394574}, NULL) = 0

Resulted in this output from the Java program:

TRACE 1 - nanos = 1126744308394574000

So the 1126744308 must have been the number of seconds, the 394574 was microseconds, and Java 1.5 on my desktop simply did this: 

((seconds * 1000000) + microseconds) * 1000 = nanoseconds
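To double check the arithmetic, here is a minimal sketch that plugs the two strace values into that formula. The TimevalToNanos class is hypothetical and only reproduces the calculation; it is not the JDK's actual source.

```java
class TimevalToNanos {
    /** Combines a gettimeofday-style (seconds, microseconds) pair into nanoseconds. */
    static long toNanos(long seconds, long microseconds) {
        return ((seconds * 1000000L) + microseconds) * 1000L;
    }

    public static void main(String[] args) {
        // Values taken from the two gettimeofday lines in the strace output.
        System.out.println(toNanos(1126744308L, 394574L)); // prints 1126744308394574000
        System.out.println(toNanos(1126744308L, 395374L)); // prints 1126744308395374000
    }
}
```

The trailing three zeros in both results are the giveaway: the value is reported in nanoseconds but only ever changes in steps of a microsecond.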

So that's the end of the road for today, hope you enjoyed this trek into the wonders of the Linux kernel!

September 18, 2005 in Development | Permalink | Comments (0)

Better Living through Algorithms

Over the years some of my favorite programming tasks required the judicious application of algorithms.  15 years ago when memory was scarce, I was packing encrypted data into memory blocks that were constrained by the old Intel architecture and the frightful Win16 environment. Lempel-Ziv-Welch compression and Huffman coding were two of the algorithms I used back then.  We had to compress - otherwise things just wouldn't fit!  (BTW, remember those "RAM Doubler" products?)

I also had to develop some quicksort implementations that did fast sorting in a static heap.  Chapter 8 of Introduction to Algorithms (1st edition) was essential for this work.

So why is Better Living through Algorithms on my mind today?

My recent work at FMG revolves around a technology we call "HyperMemory", which is a building block for some of our new products.  (Yes, I know ATI is already using the name!)

I am absolutely loving being involved in this effort, because it is taking me back to the old days...  except now, no 16-bit pointer nightmares making me miserable.

So I started hacking away on an engine that can load data from our Business Objects into memory, index it, query it, and so on.  After some review of the excellent Java Performance Tuning, Second Edition, I implemented a modified Digital Trie that links to a Ternary Tree.  This was a "modified" tree because instead of functioning as a Set I needed this index to support multiple hits on any given node in the tree.  So each node in my tree has a bucket as opposed to one value.
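To give a feel for the "bucket per node" idea, here is a minimal sketch of a plain ternary search tree in which every key maps to a list of values instead of a single one. It is not the HyperMemory code and leaves out the digital trie front end entirely; the BucketedTernaryTree name and its methods are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class BucketedTernaryTree<V> {
    private static final class Node<V> {
        final char ch;
        Node<V> lo, eq, hi;
        List<V> bucket; // stays null until some key ends at this node

        Node(char ch) {
            this.ch = ch;
        }
    }

    private Node<V> root;

    /** Adds a value under the key; repeated keys accumulate in the same bucket. */
    void put(String key, V value) {
        if (key.isEmpty()) {
            throw new IllegalArgumentException("key must not be empty");
        }
        root = put(root, key, 0, value);
    }

    private Node<V> put(Node<V> node, String key, int i, V value) {
        char c = key.charAt(i);
        if (node == null) {
            node = new Node<V>(c);
        }
        if (c < node.ch) {
            node.lo = put(node.lo, key, i, value);
        } else if (c > node.ch) {
            node.hi = put(node.hi, key, i, value);
        } else if (i < key.length() - 1) {
            node.eq = put(node.eq, key, i + 1, value);
        } else {
            if (node.bucket == null) {
                node.bucket = new ArrayList<V>();
            }
            node.bucket.add(value);
        }
        return node;
    }

    /** Returns every value stored under the key, or an empty list. */
    List<V> get(String key) {
        Node<V> node = root;
        int i = 0;
        while (node != null && i < key.length()) {
            char c = key.charAt(i);
            if (c < node.ch) {
                node = node.lo;
            } else if (c > node.ch) {
                node = node.hi;
            } else if (i == key.length() - 1) {
                return node.bucket == null
                    ? Collections.<V>emptyList()
                    : Collections.unmodifiableList(node.bucket);
            } else {
                node = node.eq;
                i++;
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        BucketedTernaryTree<Integer> index = new BucketedTernaryTree<Integer>();
        index.put("smith", 1);
        index.put("smith", 2);
        index.put("smythe", 3);
        System.out.println(index.get("smith")); // prints [1, 2]
    }
}
```

With a Set-style tree the second "smith" would overwrite or be ignored; with a bucket both row ids survive, which is what an index over non-unique columns needs.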

Before adding the indexing, benchmarking of HyperMemory searches was measured in "milliseconds"  (1/1000 of a second = 1 millisecond).  After I added the indexing layer, I had to start using the cool new Java 5 System.nanoTime() call.

That's right.  NANO TIME.  Better living indeed!

I ended up rounding up to microseconds (1/1,000,000 of a second) to make more sense of the numbers - not to mention that depending on the underlying OS, this call may give you nanosecond resolution but the accuracy is actually to the microsecond.  I hope to expand on this in another post some day.

OK, so having "been there/done that" back in the 90's, using unmanaged code in the pathetic Win16 environment I gotta tell ya...  being here 15 years later, doing it all over again - sans Win16/DOS, sans 16-bit headaches...  It's great to be alive in 2005!

September 06, 2005 in Development | Permalink | Comments (0)

    About


    • Scott Fraser's Facebook profile