General

How did the GeoAPI project get started, and what is its history?

The GeoAPI project emerged from the collaboration of several free software projects and from the work on various specifications at the Open Geospatial Consortium (OGC).

You can follow the prehistory of GeoAPI by reading the following three posts to the DigitalEarth.org website; at that point the project had no name, only a goal of bringing together multiple Java GIS projects.

As you can see in part III, the OGC had just announced a Geographic Objects initiative which intended to define Java interfaces for geographic software. This followed earlier work on the OGC Implementation Specification 01-009 Coordinate Transformation Services which included interfaces defined in the org.opengis namespace ultimately adopted by GeoAPI.

The GeoAPI project eventually formed to pursue this work. The interfaces defined in the OGC specification 01-009 became GeoAPI version 0.1. GeoAPI 1.0 was released with the draft of OGC specification 03-064 GO-1 Application Objects. In May 2005, the final draft of the GO-1 specification, which included GeoAPI interfaces, was accepted as an OGC standard and the matching version of GeoAPI was released as version 2.0.

The GeoAPI 3.0 working group of the OGC was formed in January 2009 to formalize and continue the work of standardizing the most stable interfaces produced by the GeoAPI project. The GeoAPI specification produced by this group became an OGC standard in April 2011.


What is the relationship between GeoAPI and OGC?

GeoAPI is closely tied to the OGC both in its origins and in its ongoing work.

The GeoAPI project is a collaboration of participants from various institutions and software communities. The GeoAPI project is developing a set of interfaces in the Java language to help software projects produce high quality geospatial software. The core interfaces follow closely the specifications produced in the 19100 series of the International Organization for Standardization (ISO) and by the OGC. The interfaces use the org.opengis namespace and copyright to the code is assigned to the OGC. The project started with the code produced by the OGC Implementation Specification 01-009 Coordinate Transformation Services and refactored this code in collaboration with the standardization work surrounding the OGC specification 03-064 GO-1 Application Objects.

The GeoAPI working group of the OGC is a separate effort made up principally of members of the OGC and formed to continue the work of formalizing the interfaces developed by the GeoAPI project as ratified standards of the OGC. The working group decided to start the GeoAPI Implementation Specification as a new standard focused exclusively on the interfaces produced by the GeoAPI project. In acknowledgment of the earlier work and to match the numbering scheme of GeoAPI, the first specification released under this name carries the 3.0 version number.


Why a standardized set of programming interfaces? Shouldn't OGC standards stick to web services only?

We believe that both approaches are complementary. Web services are an efficient way to publish geographic information using existing software. But some users need to build their own solution, for example as a wrapper on top of their own numerical model. Many existing software packages provide sophisticated developer toolkits, but each toolkit has its own learning curve, and one cannot easily switch from one toolkit to another or mix components from different toolkits. With standardized interfaces, a significant part of the API can stay constant across different toolkits, thus reducing both the learning curve (especially since the interfaces are derived from published abstract UML) and the interoperability pain points.

The situation is quite similar to that of JDBC (Java DataBase Connectivity). The fact that a high-level language already exists for database queries (SQL) doesn't mean that low-level programming interfaces are not needed. The JDBC interfaces were created as developer tools complementing SQL, and they have proven to be quite useful.


With standardization of interfaces, aren't you forcing a particular implementation?

We try to carefully avoid implementation-specific API. Again, JDBC is a good example of what we try to achieve: it is a successful interfaces-only specification implemented by many vendors. Four categories of JDBC drivers exist (pure Java, wrappers around native code, etc.). Implementations exist for Access, Derby, HSQL, MySQL, Oracle, PostgreSQL and many others.

The implementation flexibility is demonstrated by the GeoAPI-Proj.4 wrappers. Proj.4 is an open source C/C++ library performing map projections. The Proj.4 API is totally different from GeoAPI. Despite the major design differences and the different programming language, the geoapi-proj4 module successfully provides a view over Proj.4 functionality through GeoAPI interfaces.

It is important to stress that GeoAPI is all about interfaces. Concrete classes must implement all methods declared in their interfaces, but those interfaces don't put any constraint on the class hierarchy. For example, GeoAPI provides a MathTransform2D interface which extends MathTransform. In no way do implementation classes need to follow the same hierarchy. Actually, in the particular case of MathTransforms, they usually don't! A class implementing MathTransform2D doesn't need to extend a class implementing MathTransform. The only constraint is to implement all methods declared in the MathTransform2D interface and its parent interfaces.
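The point about class hierarchies can be sketched with a small, self-contained example. The two interfaces below are deliberately simplified stand-ins that only reuse the GeoAPI names; their methods are assumptions made for illustration and are not the real org.opengis declarations. ScaleTransform and Translation2D are hypothetical implementation classes.

```java
class HierarchyDemo {
    public static void main(String[] args) {
        MathTransform generic = new Translation2D(10, 20);
        double[] p = generic.transform(new double[] {1, 2});
        System.out.println(p[0] + ", " + p[1]);                  // prints "11.0, 22.0"
        // The 2D class is usable as a MathTransform without sharing a superclass.
        System.out.println(generic instanceof ScaleTransform);   // prints "false"
    }
}

// Simplified stand-in: the real GeoAPI interface declares different methods.
interface MathTransform {
    double[] transform(double[] point);
}

// Simplified stand-in: the interface hierarchy mirrors the one described above.
interface MathTransform2D extends MathTransform {
    double[] transform(double x, double y);   // hypothetical 2D convenience method
}

// Hypothetical class implementing the parent interface only.
class ScaleTransform implements MathTransform {
    private final double factor;

    ScaleTransform(double factor) {
        this.factor = factor;
    }

    @Override
    public double[] transform(double[] point) {
        double[] result = new double[point.length];
        for (int i = 0; i < point.length; i++) {
            result[i] = point[i] * factor;
        }
        return result;
    }
}

// Implements the child interface WITHOUT extending ScaleTransform or any
// other MathTransform implementation: only the interface contract matters.
final class Translation2D implements MathTransform2D {
    private final double dx;
    private final double dy;

    Translation2D(double dx, double dy) {
        this.dx = dx;
        this.dy = dy;
    }

    @Override
    public double[] transform(double x, double y) {
        return new double[] {x + dx, y + dy};
    }

    @Override
    public double[] transform(double[] point) {
        return transform(point[0], point[1]);
    }
}
```

Translation2D satisfies both interfaces on its own, so the class hierarchy stays entirely the implementer's choice.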


Why does GeoAPI have some departures from ISO specifications? Shouldn't GeoAPI be strictly ISO-compliant?

The ISO 19103, 19111 and 19115 specifications define mostly data structures convertible to XML or database schemas. Indeed, the EPSG database schema closely follows the ISO 19111 structures. However, those ISO specifications define very few operations. For example, ISO 19111 provides data structures for accurately describing Coordinate Reference System objects, but says nothing about the methods to invoke for performing the actual operations on coordinate values. In terms of programming languages, the above-cited ISO specifications define only no-argument getter methods. Methods performing calculations based on parameter values are left unspecified.

Since GeoAPI targets programming languages rather than data formats, we provide some methods performing calculations. Rather than inventing our own, we fetch those methods from other OGC specifications - including retired specifications - as much as possible. Those methods are documented in various places:

GeoAPI also defines a few convenience methods for frequently used operations (for example fetching the Coordinate Reference System of an Envelope) and methods for interoperability with the standard Java library. Those methods are also documented in the departures page.

Technical

Why don't you translate all OGC's UML into Java interfaces using some automatic script?

We tried that path at the beginning of the GeoAPI project, and abandoned it. Automatic scripts provide useful starting points, but their output does not always match the expectations of Java developers. For example, a popular approach is to generate Java classes from the XML schemas using JAXB-related technologies. Unfortunately the XML schema defined by ISO 19139 is quite unusual, introducing a lot of redundant elements. The XML fragment below is followed by the Java code that would fetch its URL element if the Java API were derived automatically from the ISO 19139 schema, and then by the equivalent GeoAPI code:

XML fragment:
<gmd:MD_Metadata>
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation>
        <gmd:CI_Citation>
          <gmd:citedResponsibleParty>
            <gmd:CI_ResponsibleParty>
              <gmd:contactInfo>
                <gmd:CI_Contact>
                  <gmd:onlineResource>
                    <gmd:CI_OnlineResource>
                      <gmd:linkage>
                        <gmd:URL>http://www.opengeospatial.org</gmd:URL>
                      </gmd:linkage>
                    </gmd:CI_OnlineResource>
                  </gmd:onlineResource>
                </gmd:CI_Contact>
              </gmd:contactInfo>
            </gmd:CI_ResponsibleParty>
          </gmd:citedResponsibleParty>
        </gmd:CI_Citation>
      </gmd:citation>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>

Note: the examples below ignore the iterations over collections and the checks for null values.

From automatic tools:

metadata.getIdentificationInfo().getMD_DataIdentification()
        .getCitation().getCI_Citation()
        .getCitedResponsibleParty().getCI_ResponsibleParty()
        .getContactInfo().getCI_Contact()
        .getOnlineResource().getCI_OnlineResource()
        .getLinkage().getURL();

From GeoAPI (defined by humans):

metadata.getIdentificationInfo().getCitation()
        .getCitedResponsibleParty().getContactInfo()
        .getOnlineResource().getLinkage().getURL();

We could derive the API from the UML in Rational Rose format instead of the XML schemas, but a lot of human intervention is still essential. The relationship between UML and Java interfaces is not always straightforward. For example:

  • Structures of type union are expressed in Java either by rearranging the interface hierarchy, by interface multi-inheritance or by omitting the data structure, on a case-by-case basis.

  • Resolving some overlaps between specifications requires human reading. For example ISO 19111:2007 section 3 specifies "in this international standard, normative reference to ISO 19115 excludes the MD_CRS class and its components classes" in order to avoid duplication. An automatic script would not have made this exclusion.

  • Some complexity introduced by historical standardization processes can be avoided. For example ISO 19115-2 defines imagery metadata which were not ready in time for the ISO 19115 schedule. Since new attributes could not be added to the existing ISO 19115 classes, they were added in ISO 19115-2 sub-classes of the same name (e.g. MI_Band extends MD_Band). GeoAPI merges those "geological layers".

  • Some additional interfaces or methods were introduced (see the earlier question on GeoAPI departures from ISO specifications).

The changes applied by human intervention are documented in the departures page.


Why do you favor Collections over arrays as a return type?

For performance, a more orthogonal API, and more freedom on the implementer side.

Performance (including memory usage)

Some robust implementations will want to protect their internal state against uncontrolled changes. In such implementations, getter methods need to make defensive copies of their mutable attributes (see Effective Java). Since arrays are mutable objects, robust implementations would need to clone their arrays before returning them; otherwise nothing prevents a user from writing the following:

pointArray.positions()[1000] = null;

and thus altering the PointArray state if positions() returned a direct reference to its internal array. Collections, on the other hand, can be read-only views over internal arrays, thus avoiding the need to clone their data in getter methods.
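A minimal sketch of such a read-only view, using only the standard Java library, could look like the following. SimplePointArray is a hypothetical stand-in for a PointArray implementation, and positions are represented as plain strings to keep the example self-contained.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ReadOnlyViewDemo {
    public static void main(String[] args) {
        SimplePointArray points = new SimplePointArray("A", "B", "C");
        List<String> view = points.positions();
        System.out.println(view.get(1));          // prints "B"
        try {
            view.set(1, null);                    // attempt to alter the internal state
        } catch (UnsupportedOperationException e) {
            System.out.println("Internal state is protected.");
        }
    }
}

/** Hypothetical stand-in for a PointArray implementation (not a GeoAPI class). */
final class SimplePointArray {
    private final String[] positions;
    private final List<String> view;

    SimplePointArray(String... positions) {
        // Copy once at construction time, so the caller's array is not shared.
        this.positions = positions.clone();
        // Arrays.asList wraps the array without copying it;
        // unmodifiableList rejects every write attempted through the view.
        this.view = Collections.unmodifiableList(Arrays.asList(this.positions));
    }

    /** Returns the same read-only view on every call: no cloning in the getter. */
    List<String> positions() {
        return view;
    }
}
```

The getter returns the same wrapper object each time, so its cost does not depend on the number of positions, whereas an array-returning getter would have to clone the whole array on every call to stay safe.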

More orthogonal API

If a geometry is mutable (at the implementer's choice), a user may wish to add, edit or remove elements. With arrays as return types, we would need to add some add(...) and remove(...) methods in most interfaces. With collections, this extra API weight is not needed since the user can write the following idiom:

pointArray.positions().add(someNewPosition);

The PointArray behavior in such a case is left to implementers. It may throw an UnsupportedOperationException, keep the point in memory, store its coordinates immediately in a database, etc.

In addition to keeping the API lighter, collections as return types also give us many additional methods for free, like contains(...), addAll(...), removeAll(...), etc. Adding those kinds of methods directly to the geometry interfaces would basically transform geometries into a new kind of collection and duplicate the work of the collection framework without its "well accepted standard" characteristic.
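As an illustration of how an implementer might decide what add(...) does, the sketch below exposes a list backed by java.util.AbstractList. RecordingPointArray, its write counter and the string positions are all hypothetical; the counter merely stands in for a side effect such as a database write.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MutablePositionsDemo {
    public static void main(String[] args) {
        RecordingPointArray points = new RecordingPointArray();
        points.positions().add("P1");
        points.positions().addAll(Arrays.asList("P2", "P3"));
        System.out.println(points.positions().contains("P2"));  // prints "true"
        System.out.println(points.writeCount());                // prints "3"
    }
}

/** Hypothetical mutable PointArray stand-in (not a GeoAPI class). */
final class RecordingPointArray {
    private final List<String> storage = new ArrayList<>();
    private int writeCount;

    // The implementer only overrides get, size and add(int, E); the inherited
    // add(E), addAll(...), contains(...) and iterators are built on top of them.
    private final List<String> view = new AbstractList<String>() {
        @Override
        public String get(int index) {
            return storage.get(index);
        }

        @Override
        public int size() {
            return storage.size();
        }

        @Override
        public void add(int index, String position) {
            storage.add(index, position);
            writeCount++;   // stand-in for an implementation-specific side effect
            modCount++;     // keeps the inherited iterators fail-fast
        }
    };

    List<String> positions() {
        return view;
    }

    int writeCount() {
        return writeCount;
    }
}
```

An implementation that prefers to stay immutable would simply not override add(int, E), in which case AbstractList throws UnsupportedOperationException.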

More freedom on implementer side

  • Collections are more abstract than arrays:
    • In the Java language, a collection can be a view over an array (using Arrays.asList(...) for example). The converse is impossible in the general case (Collection.toArray() doesn't create a view; it copies the elements into an array).
    • In .NET, an array is a collection but a collection is not always an array. Conversions from an arbitrary collection to an array may require a copy, like in Java.
  • A collection can be read-only or not, at the implementer's choice. Java arrays are always mutable and need defensive copies (not to be confused with defensive copies of array or collection elements).
  • Collections allow one more degree of freedom for deferred execution or lazy data loading. Object creation can occur on a per-element basis in collection getter methods. In an array, the references to all elements must be initialized before the array is returned.
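The lazy loading mentioned in the last point can be sketched as follows. LazyList is a hypothetical helper written for this example, not part of GeoAPI or the Java library; it assumes that an element can be created from its index alone.

```java
import java.util.AbstractList;
import java.util.List;
import java.util.function.IntFunction;

class LazyListDemo {
    public static void main(String[] args) {
        int[] loads = {0};   // counts how many elements were actually created
        List<String> positions = new LazyList<>(1_000_000, i -> {
            loads[0]++;
            return "position " + i;
        });
        System.out.println(positions.get(42));   // prints "position 42"
        System.out.println(loads[0]);            // prints "1"
    }
}

/** Hypothetical read-only list creating each element only when it is requested. */
final class LazyList<E> extends AbstractList<E> {
    private final int size;
    private final IntFunction<E> loader;

    LazyList(int size, IntFunction<E> loader) {
        this.size = size;
        this.loader = loader;
    }

    @Override
    public E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        return loader.apply(index);   // the element is created here, on demand
    }

    @Override
    public int size() {
        return size;
    }
}
```

Returning the same million positions as an array would require creating all of them before the getter returns, even if the caller reads only one.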