Friday, January 24, 2014

Java Web Start - using parameters in URL for the application started via Web Start

A few days ago, I was faced with the problem of passing some parameters (via a URL, using a GET request) to a Java Web Start based application, something I hadn't done before :)

This is actually quite simple: what you basically have to do is generate the JNLP file dynamically instead of using a static one. There are several ways of generating a JNLP file dynamically, such as a JSP file, a servlet, a SOA service, etc.

I will briefly present a way to generate this JNLP file using a JSP. I am assuming you already have a .jnlp file and want to adapt it to use parameters.

1. Rename the existing .jnlp file to .jsp.

2. Add the following scriptlets at the beginning of the jnlp, now jsp, file (they compose the URL and set the response headers):

<%
String urlSt = request.getRequestURL().toString();
String jnlpCodeBase = urlSt.substring(0, urlSt.lastIndexOf('/'));
String jnlpRefURL = urlSt.substring(urlSt.lastIndexOf('/') + 1);
%>
<%
    response.setContentType("application/x-java-jnlp-file");
    response.setHeader("Cache-Control", null);
    response.setHeader("Set-Cookie", null);
    response.setHeader("Vary", null);
%>

3. Customize the codebase and href attributes of the <jnlp> tag to include your parameters. As you can see, I am using two example parameters:
   <jnlp codebase="<%=request.getScheme() + "://"+ request.getServerName() + ":" + request.getServerPort()+ request.getContextPath() + "/" %>" href="jnlp/launch.jnlp?url=<%=request.getParameter("url")%>&user=<%=request.getParameter("user")%>">


4. Add your arguments again to the application-desc tag inside your JNLP file (now JSP):
<application-desc main-class="com.ibm.myapp.SomeGreatApp">
    <argument><%=request.getParameter("url")%></argument>
    <argument><%=request.getParameter("user")%></argument>
</application-desc>

5. Deploy your app; you can then access the JNLP file using a request such as:
http://localhost:8080/myapp/jnlp/launch.jsp?url=http://localhost:8080/someURL/somepage.html&user=some_username
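Since the url parameter value itself contains reserved characters (':', '/', and possibly '?' and '&'), callers should URL-encode the parameter values when composing such a request. A minimal sketch using java.net.URLEncoder (the LaunchUrlBuilder class name is mine, purely for illustration):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class LaunchUrlBuilder {

    // Builds the launch URL, encoding parameter values so that characters
    // like ':' and '/' inside the url parameter survive the GET request intact.
    public static String build(String base, String url, String user) {
        try {
            return base + "?url=" + URLEncoder.encode(url, "UTF-8")
                        + "&user=" + URLEncoder.encode(user, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }
}
```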

This way, the parameters you send in the GET request are passed on to the application started via Web Start.
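On the receiving side, the Web Start application gets these values as plain main arguments, in the order of the <argument> tags. A sketch of how the main class from the example above might read them (the field names and the argument check are mine):

```java
public class SomeGreatApp {

    // Populated from the <argument> tags in the generated JNLP:
    // first the url parameter, then the user parameter.
    static String url;
    static String user;

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("Expected <url> <user> arguments from the JNLP file");
            return;
        }
        url = args[0];
        user = args[1];
        System.out.println("Starting with url=" + url + ", user=" + user);
    }
}
```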

Monday, January 20, 2014

Big Data Analytics in Cloud - Patterns and Use-Cases

Introduction


Nowadays, people have already implemented several Big Data Analytics solutions, and through these, a few macro patterns have emerged. You can find them below, along with a short explanation of each.

I also mention a few implementation possibilities next to each of them. Of course, there are other options as well for each of my examples (I decided to mostly refer to IBM software).

1. Landing zone warehouse (HDFS -> ETL DW)


This is composed of a landing zone (Big Data Warehouse), handled using InfoSphere BigInsights (based on Hadoop), that reads the data from various sources and stores it on HDFS. This could be done through ETL batch processes. The data is unstructured, semi-structured, or structured, and generally needs to be processed and organized.

From there, data can be loaded into a Big Data Report Mart via batch ETL. The main advantage is that the data becomes structured and organized according to customer needs. The mart could be built on Cognos, Netezza, DB2, etc.

It can then be queried through SQL or by making use of the reporting features of the above tools.
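To make the flow concrete, here is a purely conceptual, in-memory sketch of the landing zone -> report mart step: raw records land as-is, and a batch "ETL" pass keeps and structures only the records that fit the mart's schema. The comma-separated record format and the class name are invented for illustration; in practice this work is done by the Hadoop/ETL tooling mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

public class LandingZoneEtl {

    // Batch "ETL" pass over the landing zone: raw records of any shape
    // come in; only those that can be structured into the mart's
    // three-column schema (date, customer, amount) are kept.
    public static List<String[]> toReportMart(List<String> rawRecords) {
        List<String[]> mart = new ArrayList<>();
        for (String raw : rawRecords) {
            String[] fields = raw.split(",");
            if (fields.length == 3) { // keep only records we can structure
                mart.add(fields);
            }
        }
        return mart;
    }
}
```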


2. Streams dynamic warehouse (RT Filter -> HDFS/DW)


The data in this pattern is organized mostly the same as in the previous one. It too has a Landing Zone (Big Data Warehouse), handled with InfoSphere BigInsights for instance, and a Big Data Summary Mart (handled with IBM PureData for Analytics). From this Summary Mart, reports can be extracted using SQL, for instance.

The main difference is that the incoming data stream into the cloud is initially processed using InfoSphere Streams in real time. It is filtered and analysed, then stored in the Big Data Warehouse (landing zone). This data is going to be mostly structured. A part of it is stored directly into the Big Data Summary Mart.

The upside in this case is that while the data mart contains structured information as well, the processing happens in real time as the data arrives, with the help of InfoSphere Streams. Therefore the Data Warehouse will contain real-time structured data.


3. Streams detail with update (combination 2 & 1)


This is a combination of the first two. The data streams are processed in real time as they arrive, using InfoSphere Streams. They are then stored in both the Landing Zone and the Big Data Summary Mart, as in #2. Additionally, there is also a Detail Data Mart, loaded through ETL processes from the Landing Zone (Big Data Warehouse). This way, additional processing can be done on this data, for instance analysis over large data sets, or processing that makes use of data that is not available in real time.

The applications will access both the Big Data Summary Mart, loaded with data processed in real time, and the Detail Data Mart, filled by the ETL batch processes.


4. Direct augmentation (HDFS -> augment -> DW)


The incoming data is loaded into a Big Data Warehouse (Landing Zone) via batch ETL and contains mostly unstructured data. It can then be accessed directly from HDFS through Hive (via a virtual database).

Additionally, there is also an existing Data Mart containing transactional data acquired previously from other sources (internal ones, perhaps).

Applications will make use of both landing zone and data mart simultaneously.


5. Warehouse augmentation (HDFS -> augment analytics)


This is basically the same as #1, with the addition that the data acquired in the Big Data Warehouse is subsequently enhanced using an Analytics Engine. It is then loaded into the Summary Data Mart using the same ETL batch processes.


6. Streams augmentation (augment & filter RT -> HDFS/DW)


This pattern also makes use of the Analytics Engine, which enhances/augments the data coming into InfoSphere Streams (and processed in real time). From there, the filtered data is saved to the Big Data Warehouse, handled by InfoSphere BigInsights. The ETL batch processes read and process this data and load it into the Data Summary Mart. It can then be accessed through SQL queries, for instance.


7. Dynamic cube (HDFS -> search indexes)


This is the most complex pattern of all :). The data arrives in the Big Data Warehouse, handled by InfoSphere BigInsights for example.

It is subsequently processed by an index crawler, indexed using a big data index, and then accessed through a Virtual Data Mart.

They say that through this pattern you're building your own Google (search engine) ;)



Primary Big Data Use-Cases


1. Big Data Exploration - find, visualize and understand the data to improve decision making.

2. Enhanced 360 degree view of the Customer - all customer data in one place by incorporating all sources.

3. Security/Intelligence Extension - monitor and detect in real time.

4. Operations Analysis - analyze a variety of machine and operation data for improved business results.

5. Data Warehouse Augmentation - integrate big data and data warehouse capabilities for improved business results.  Optimize your data warehouse to enable new types of analysis.

Thursday, January 16, 2014

A few words on Clouds - Introduction

2013 was the year when clients embraced cloud solutions, and several cloud patterns crystallized and were validated through successful client engagements.

There are several cloud offerings, including open source ones. Examples come from Google, Microsoft, Amazon, IBM, Rackspace, the open source community (OpenStack), and the list continues...

What are the benefits of using a cloud? It's a way to externalize the tasks of hardware/software administration, maintenance, and support by delegating them, along with all the requirements that need to be addressed through these activities (hosting, provisioning, availability, security, scalability, monitoring, etc.), to other entities. It may also provide access to third-party software via SOA.

Regarding cloud usage patterns, a few approaches have been explored. One can have public clouds, where cloud services are offered to external clients, or private clouds, used by single organizations.

The public clouds have been diversifying their offerings, and a few usage patterns have emerged:

- IaaS, or Infrastructure as a Service. It's basically a way to make computing power available to clients. One common way to achieve this is through virtualization software, where clients get access to virtual images on which they can install and configure software of their choice. Depending on the business case, this can actually be cheaper for clients, because they are freed from buying and administering the hardware and OSes. Through agreed SLAs, clients usually know what to expect regarding the services they receive.

- PaaS, or Platform as a Service. The provider delivers a computing platform where programming language execution environments, web servers, and databases are available. This builds on top of IaaS by providing the additional software required to build and run applications.

- SaaS, or Software as a Service. In addition to PaaS services, clients are provided access to application software and services as well. This might be in the form of SOA or other related services, and/or third-party software such as Microsoft Office, Exchange, and others. Clients are able to deploy their applications to the cloud and make use of the cloud infrastructure and platform, the deployed software, as well as any SOA-related endpoints made available by the cloud provider.

We will talk about patterns for Big Data & Analytics in the cloud next.

Thursday, January 9, 2014

Why we need Architects

Nowadays one can see numerous attempts to break down and conquer the complexities of software development through processes, roles, standards, and tasks; while I think this is certainly a valid approach, I also believe that it's no silver bullet.

No matter how smartly you design your processes, tasks, and roles, your weak link is nevertheless going to be your employees. Therefore, it makes sense to invest in people to strengthen this link. You need experienced and capable people with the right attitude. In addition, I believe there are a few key roles that most IT projects need: PM, Architect, Team Lead, Developer, and Tester (as you all know, there are other roles as well, but they are not important for the sake of our argument here). Of course, in real life, some people will sometimes have more than one role.

While the PM needs to make sure the project stays within time and budget, and the TL's job is to coordinate the development team through the daily tasks and do development of their own, the Architect's job is to make sure the project goals are met technically.

It is the Architect's job to ensure there is a clear understanding with the client on what needs to be built, to go to the drawing board and make a technical plan in agreement with the client (gathering feedback from everyone along the way), and to make sure that the artifacts delivered at the end of the project meet those objectives.

There is wide agreement nowadays that most IT projects fail. Failure in this context means that most projects actually go over time and/or budget. The most important reason for this is that the teams involved fail to meet the technical challenges along the way, be it requirements, solution design, estimation, development, testing, and so on. Most of the time this happens not because they lack the desire to do the job, but because building computer systems is hard :).

That is why you need an experienced technical person on your project who is concerned mainly with the overall technical goals of the project (Architect/Technical Manager). This includes making sure that the requirements are in line with the client's interests and plans, and that they are sound, achievable, clear, and non-conflicting. It also includes the technical solution design and plan, agreed with the client, proactiveness in identifying and addressing the technical risks that may appear in the project, and making sure that the project deliverables are in line with the technical solution plan.

It only makes sense that the presence of Architects on those teams improves the chances of success for projects. And as we all know, success means happier clients, more business from them, and more money for us.

More on this subject later, have fun

Wednesday, January 8, 2014

HTML documentation for your REST services

Hello All

New year, new me :). I decided to start sharing some of the more interesting things I am doing, so that they remain somewhere and I can use them myself later if the need arises.

So, here it goes.

I was looking for a way to document REST services these days. While there are some worthy tools, such as Enunciate, Swagger, and others, I found a simpler and effective way to do it.


We're using Jersey at the moment, and this post refers to configuring Jersey in this regard.

In short, the trick is to extend the generated WADL to contain the documentation (javadoc) taken from your service Java classes.

Here's how it goes in a few short steps:

1. Document your services and request/responses (javadoc style)
EX:
    /**
     * Use this method to subscribe to fx market rates from the price feed.
     * @param tradingBranchId the trading branch ID
     * @param userId the user ID
     * @param customerId the customer ID
     * @response.representation.200.qname {http://www.example.com}item
     * @response.representation.200.mediaType application/xml
     * @response.representation.200.doc This is the representation returned by default (if we have an even number of millis since 1970...:)
     * @response.representation.503.mediaType text/plain
     * @response.representation.503.example You'll get some explanation why this service is not available
     * @response.representation.503.doc You'll get some explanation why this service is not available - return
     * @return a JAX-RS Response containing the pricing information
     */
    @GET
    @Consumes({ MediaType.TEXT_HTML, MediaType.APPLICATION_XML, MediaType.APPLICATION_JSON })
    @Produces({ MediaType.TEXT_HTML, MediaType.APPLICATION_XML, MediaType.APPLICATION_JSON })
    @Path("/subscribe/branches/{trading-branch-id}/cust/{cust-id}/user/{user-id}/")
    public Response subscribeFXLimitOrder(
                    @PathParam("trading-branch-id") String tradingBranchId,
                    @PathParam("user-id") String userId,
                    @PathParam("cust-id") String customerId,
                    @QueryParam("channel-id") String channelId,
                    @Context Request request,
                    @Context UriInfo uriInfo)
    { ...

2. Use the Jersey doclet features at build time to generate the documentation file.
EX:
 <target name="generateRestDoc" description="Generates a resource desc file with the javadoc from resource classes">
  <antelope:call target="createPathReferences" />
  <javadoc classpathref="lbbw.service_api.alldeps.jars.path.id">
      <fileset dir="${lbbw.service_api.rest.dir}/src/java" defaultexcludes="yes">
          <include name="com/ibm/vd/otr/tradingapi/rest/resources/**" />
      </fileset>
      <doclet name="org.glassfish.jersey.wadl.doclet.ResourceDoclet" pathref="lbbw.service_api.alldeps.jars.path.id">
          <param name="-output" value="${lbbw.service_api.rest.dir}/${class.dir}/resourcedoc.xml" />
      </doclet>
  </javadoc>
 </target>

3. Configure the Jersey framework to generate an extended WADL that includes the documentation file generated at build time. Details about the Jersey config:
Create a custom Jersey WADL generator config class, then register it with Jersey in the same manner as the rest of the services. Make sure the 3 XML files below are on the classpath (they are going to be used by the custom Jersey generator, further below).

application-doc.xml example:
<applicationDocs targetNamespace="http://wadl.dev.java.net/2009/02">

    <doc xml:lang="en" title="The doc for your API">
        This is a paragraph that is added to the start of the generated application.wadl
    </doc>

    <doc xml:lang="de" title="Die Dokumentation fuer Ihre API">
        Dies ist ein Abschnitt, der dem Beginn der generierten application.wadl hinzugefügt wird - in deutscher Sprache.
    </doc>
</applicationDocs>

application-grammars.xml example:
<grammars xmlns="http://wadl.dev.java.net/2009/02"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xi="http://www.w3.org/2001/XInclude">
<include href="schema.xsd" />
</grammars>

resourcedoc.xml is generated by the doclet Ant task above.

Custom Jersey WADL generator class example:
public class DocWadlGeneratorConfig extends WadlGeneratorConfig
{
    @Override
    public List<WadlGeneratorDescription> configure()
    {
        return generator(WadlGeneratorApplicationDoc.class)
                .prop("applicationDocsStream", "application-doc.xml")
                .generator(WadlGeneratorGrammarsSupport.class)
                .prop("grammarsStream", "application-grammars.xml")
                .prop("overrideGrammars", true)
                .generator(WadlGeneratorResourceDocSupport.class)
                .prop("resourceDocStream", "resourcedoc.xml")
                .descriptions();
    }
}

Add the config for this class to the Jersey servlet in the web.xml file:
<init-param>
   <param-name>jersey.config.server.wadl.generatorConfig</param-name>
   <param-value>com.ibm.vd.otr.tradingapi.rest.DocWadlGeneratorConfig</param-value>
</init-param>


At this point, if you deploy and hit http://localhost:9080/myapp/application.wadl, you will see an extended WADL containing the documentation taken from your REST service Java files.

EX of WADL file: (the original post included a screenshot of the extended WADL here)

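For reference, here is a hypothetical fragment of what such an extended WADL can look like. The element names follow the WADL schema; the <doc> elements carry the text from application-doc.xml and the javadoc, and the resource path is the one from the example service above:

```xml
<application xmlns="http://wadl.dev.java.net/2009/02">
    <doc xml:lang="en" title="The doc for your API">
        This is a paragraph that is added to the start of the generated application.wadl
    </doc>
    <resources base="http://localhost:9080/myapp/">
        <resource path="/subscribe/branches/{trading-branch-id}/cust/{cust-id}/user/{user-id}/">
            <method id="subscribeFXLimitOrder" name="GET">
                <doc>Use this method to subscribe to fx market rates from the price feed.</doc>
                <!-- request params and response representations follow -->
            </method>
        </resource>
    </resources>
</application>
```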
4. In order to see pretty HTML documentation in addition to the extended WADL XML file, you need to do the following:
- Download an XSL stylesheet transformation file; one example is this: https://github.com/ipcsystems/wadl-stylesheet

- Configure Jersey to apply this XSL to the extended WADL it generates (which was configured in the previous steps):
Create a Jersey WADL resource class, used to direct the browser to apply the XSL transformation to the extended WADL file by inserting the processing instruction <?xml-stylesheet type="text/xsl" href="statics/wadl.xsl"?>. Note that I am serving it at the application2.wadl URL in order to have both WADL files available at the same time: the one without styles and the one with styles:

@Produces({ "application/vnd.sun.wadl+xml", "application/xml" })
@Singleton
@Path("application2.wadl")
public final class WadlResource
{

    private WadlApplicationContext wadlContext;

    private Application application;

    private byte[] wadlXmlRepresentation;

    public WadlResource(@Context WadlApplicationContext wadlContext)
    {
        this.wadlContext = wadlContext;
    }

    @GET
    public synchronized Response getWadl(@Context UriInfo uriInfo)
    {
        if (wadlXmlRepresentation == null)
        {
            // String styleSheetUrl = uriInfo.getBaseUri().toString() + "wadl.xsl";
            this.application = wadlContext.getApplication(uriInfo).getApplication();
            try
            {
                Marshaller marshaller = wadlContext.getJAXBContext().createMarshaller();
                marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
                ByteArrayOutputStream os = new ByteArrayOutputStream();

                Writer writer = new OutputStreamWriter(os);
                writer.write("<?xml version='1.0'?>\n");
                writer.write("<?xml-stylesheet type=\"text/xsl\" href=\"statics/wadl.xsl\"?>\n");

                marshaller.setProperty(Marshaller.JAXB_FRAGMENT, Boolean.TRUE);
                marshaller.marshal(application, writer);

                writer.flush();
                writer.close();
                wadlXmlRepresentation = os.toByteArray();
                os.close();
            }
            catch (Exception e)
            {
                e.printStackTrace();
                return Response.ok(application).build();
            }
        }

        return Response.ok(new ByteArrayInputStream(wadlXmlRepresentation)).build();
    }
}
You register this class with Jersey in the same manner as the rest of your services. Also make sure the XSL file you downloaded is accessible at the URL indicated in the WADL. This is because the browser itself applies the XSL transformation to the WADL, so it needs to locate the XSL file, as directed by the processing instruction in the WADL file.

Now if you deploy and hit http://localhost:9080/myapp/application2.wadl, the browser should render the WADL as a pretty HTML page (the original post included a screenshot fragment here).

5. Congrats! Now you have a nice way to expose your REST API javadoc to others :)