tag:blogger.com,1999:blog-57014157907597555712024-03-10T10:07:02.579+01:00Niklas' BlogNiklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.comBlogger40125tag:blogger.com,1999:blog-5701415790759755571.post-36847257369497126002019-03-01T16:09:00.001+01:002019-03-01T16:09:09.253+01:00Setting up MongoDB for bi-temporal dataSee this tutorial: <a href="http://www.projectbarbel.org/docs/mongotutorial">http://www.projectbarbel.org/docs/mongotutorial</a>Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com0tag:blogger.com,1999:blog-5701415790759755571.post-80187960733213690642019-02-27T22:33:00.001+01:002019-02-27T22:34:58.331+01:00Manage bitemporal data with BarbelHisto For the last 15 years I've worked on projects for insurance businesses, implementing a variety of policy management systems. A major requirement has always been to store policies and their changes in a way that is traceable for audits and customer claims. Implementing bullet-proof bitemporal data storage has always taken a considerable amount of time (and nerves). Every time we've implemented a new policy management system we have been on the lookout for a reusable component for bitemporal data. The few options we found did not really satisfy our needs, or had too many technical constraints. For that reason I've decided to implement my own open source library that I'd like to share with you guys: <a href="http://www.projectbarbel.org/">BarbelHisto</a>. With this lightweight library I want to address the bitemporal data storage requirement without any bothersome constraints. Just managing bitemporal data, that's it. No technology baggage. 
Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com0tag:blogger.com,1999:blog-5701415790759755571.post-68776902425800052222019-01-30T15:23:00.002+01:002019-02-05T19:07:37.153+01:00Modern State Pattern using Enums and Functional Interfaces <p>It’s often the case that the behaviour of an object should change depending on the object’s state. Consider a <code>ShoppingBasket</code> object. You can add articles to your basket as long as the order isn’t submitted. But once it’s submitted, you typically don’t want to be able to change that order anymore. So there are two states in such a shopping basket object: <code>READONLY</code> and <code>UPDATEABLE</code>. <a name='more'></a> Here is the <code>ShoppingBasket</code> class.</p>
<script src="https://gist.github.com/nschlimm/13fe09e5d9decd8201edf4b278e97e14.js"></script>
<p>In such a class, you can add articles and perform an order. Once you’ve performed an order, the client of such an object would still be able to change that order object, which should not be possible. To prevent clients from updating an order that was already submitted, we want to change the behaviour of the <code>ShoppingBasket</code>. It should not be possible to add articles or change the <code>orderNo</code> field once the order is submitted. What’s an intelligent, object-oriented, modern Java solution to such a problem? What I usually do in such cases is use an <code>enum</code> to implement a GoF state pattern. Here is such an <code>enum</code>:</p>
<script src="https://gist.github.com/nschlimm/782eee30034d2330c4c3412d17a8f383.js"></script>
<p>My <code>UpdateState</code> enum takes a <code>Runnable</code> object as constructor argument. You can use more complicated functional interfaces to suit specific needs; the sky is the limit in terms of complexity here. But for now, it’s an ordinary <code>Runnable</code> interface. The <code>UpdateState</code> enum has exactly two states: <code>UPDATEABLE</code> and <code>READONLY</code>. The <code>UPDATEABLE</code> value’s validation always succeeds, while the <code>READONLY</code> value’s validation always fails, which results in an <code>IllegalStateException</code> (using the Apache Commons Lang <code>Validate</code> class). The <code>UpdateState</code> enum has a method called <code>set()</code> which takes an argument and returns exactly that argument. But before returning the argument, the <code>set()</code> method runs the state-dependent <code>Runnable</code> action. Now, why all that hassle?</p>
<script src="https://gist.github.com/nschlimm/94ce85f9cc553b2ae9546031162504b9.js"></script>
<p>The <code>ShoppingBasket</code> now has a state field of enum type <code>UpdateState</code>. That state field defaults to <code>UPDATEABLE</code> because when you create the <code>ShoppingBasket</code> it’s always updateable, meaning: the order wasn’t submitted yet. When you fire the order through the <code>order()</code> method, the state changes to <code>READONLY</code>. Since the state changed to read-only, the <code>ShoppingBasket</code> will change its behaviour, specifically when clients try to access the class fields. Let’s look at the <code>setOrderNo()</code> method for instance. The <code>setOrderNo()</code> method does not assign the order number directly to the <code>orderNo</code> field anymore; instead it calls the <code>UpdateState</code> enum’s <code>set()</code> method, which returns the value you want to set. That return value is assigned to the <code>orderNo</code> field. The <code>set()</code> method of the <code>UpdateState</code> enum always checks whether updates are allowed. So when your <code>ShoppingBasket</code>’s state is <code>UPDATEABLE</code>, the <code>set()</code> method will succeed, but when it’s <code>READONLY</code>, the <code>set()</code> method of that state will throw an <code>IllegalStateException</code>. This is exactly what we wanted to achieve in the beginning: make the object read-only once the order is submitted.</p>
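Condensed into one self-contained snippet, the mechanism looks roughly like this. This is a sketch that follows the names used in this post, not the original gists; to keep it dependency-free, a plain <code>IllegalStateException</code> is thrown instead of using the Commons Lang <code>Validate</code> call.

```java
// Sketch of the enum-based state pattern described above (illustrative,
// not the original gist code).
enum UpdateState {
    // UPDATEABLE: the state-dependent action does nothing, so set() succeeds.
    UPDATEABLE(() -> { }),
    // READONLY: the action always fails, so set() throws.
    READONLY(() -> { throw new IllegalStateException("object is read-only"); });

    private final Runnable action;

    UpdateState(Runnable action) {
        this.action = action;
    }

    // Runs the state-dependent action, then passes the given value through.
    public <T> T set(T value) {
        action.run();
        return value;
    }
}

class ShoppingBasket {
    private UpdateState state = UpdateState.UPDATEABLE;
    private String orderNo;

    public void setOrderNo(String orderNo) {
        this.orderNo = state.set(orderNo); // throws if state is READONLY
    }

    public void order() {
        this.state = UpdateState.READONLY; // order submitted: freeze the object
    }

    public String getOrderNo() {
        return orderNo;
    }
}
```

Every accessor delegates the actual assignment to <code>state.set(...)</code>, so the state check lives in exactly one place instead of being repeated as if-else logic in each method.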
<p>Notice that you can make such a state pattern implementation as complex as required. It’s a very elegant, short option to drive your object’s behaviour by the object’s state. And it saves you a lot of non-object-oriented if-else logic in all the accessor methods. Consider classes that have 20 fields: you don’t want to check state each time in every method. That would clearly clutter up your class code. Using the demonstrated state pattern, you save lines of code and your class stays quite tidy. Change the functional interface used in the <code>UpdateState</code> enum and you’ll realize the great potential of state-dependent behaviour that can be implemented with very few lines of code.</p>Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com0tag:blogger.com,1999:blog-5701415790759755571.post-83628798732870101012019-01-29T16:18:00.002+01:002019-02-05T19:06:28.796+01:00Passing multiple arguments into stream filter predicates<p>When I am working with Java streams I use filters intensively to find objects. I often have the situation where I'd like to pass two arguments to the filter function. Unfortunately the standard API only accepts a <code>Predicate</code>, not a <code>BiPredicate</code>.</p>
<p>To work around this limitation I define all my predicates as methods in a class, say <code>Predicates</code>. That predicate class takes a constant parameter.</p>
<a name='more'></a>
<script src="https://gist.github.com/nschlimm/a1931f2fe787c7f03a4b17d023373ff4.js"></script>
<p>When I use the <code>Predicates</code> class, I instantiate it with the constant parameter of my choice. Then I can pass the instance methods as method references to the filter. Like so:</p>
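Both pieces together can be sketched in one self-contained example. The class and method names here are illustrative stand-ins; the gists in this post are the authoritative versions.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of the technique described above: the "second argument"
// of the predicate is bound as a constructor parameter, leaving a one-argument
// instance method that fits java.util.function.Predicate.
class Predicates {
    private final int threshold; // the constant parameter

    Predicates(int threshold) {
        this.threshold = threshold;
    }

    // One free parameter left: usable as a method reference in filter().
    boolean greaterThanThreshold(int candidate) {
        return candidate > threshold;
    }
}

public class PredicatesDemo {
    static List<Integer> filterGreaterThan(List<Integer> values, int limit) {
        Predicates p = new Predicates(limit); // bind the constant argument
        return values.stream()
                     .filter(p::greaterThanThreshold) // method reference as Predicate
                     .collect(Collectors.toList());
    }
}
```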
<script src="https://gist.github.com/nschlimm/e3c45256b9eb6130866089219a56ffb0.js"></script>Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com0tag:blogger.com,1999:blog-5701415790759755571.post-27698443060142984142012-08-17T13:16:00.006+02:002012-08-21T11:11:15.876+02:005' on IT-Architecture: the modern software architectBefore I start writing about this let me adjust something right at the beginning:<br />
<blockquote style="background-color: #cfe2f3;">Yes of course, there is the role of a "software architect" in any non-trivial software development project. Even in times of agile projects, dynamic markets and vague terms like "emergence". The simple reason for that is that emergence and democracy in teams only work within constraints. Though, it's not always clever to assign somebody the role explicitly. In an ideal world one developer in that team evolves into the architecture role. </blockquote>When I started working as an IT professional at a *big* American software & IT consulting company, I spent around five years programming. After that time I got my first architecture job on a big project at a German automotive manufacturer. My main responsibility was to design the solution, advise developers, project managers and clients, and to organize the development process. I wrote many documents, but I didn't code anymore. The result was that I lost expertise in my <i>core business</i>: programming. So after a while my assessments and gut instinct got worse, which resulted in worse decisions. As a side effect of generic (vague) talk, it got harder to gain acceptance from developers, project managers and clients. When I realized all that, I decided to do more development again. Today, I have been doing architecture for 10 years, and I develop code in the IDE of my choice at least 20-30% of my time. <br />
<br />
<a name='more'></a><b>Activity profile</b><br />
<br />
Whilst programming is a <i>necessary activity</i>, there is a whole bunch of activities that are <i>sufficient</i> to be successful as an architect. Doing architecture is a lot about collaboration, evaluating alternatives objectively (neutral and fair-minded) and about decision making. It's a lot about communication, dealing with other individuals that almost always have their own opinions. Furthermore it's a lot about forming teams and designing the ideal development process around those teams to solve the concrete problem. Last but not least it's about designing (structuring) the solution in a way that all functional and non-functional requirements are well covered. You can do all that more or less without up-to-the-minute technical knowledge. But I believe an architect can do better if he/she has technical expertise gathered through day-to-day coding. In the long run you cannot be a technical architect without sufficient coding practice.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjLQCtzNf-I5BVqEBeGtJaG5YL4xU7i1KrLR1j6mb_1fCZaxXcRaHYOPp-8PoEl-z_AVeRtL0x_8-vsVXvuqmE5Kk_0EriYj_PpHlWF6XIhJWt4pFArVbFnDThYgQsEY5A0idR_efKNgWs/s1600/Foto.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjLQCtzNf-I5BVqEBeGtJaG5YL4xU7i1KrLR1j6mb_1fCZaxXcRaHYOPp-8PoEl-z_AVeRtL0x_8-vsVXvuqmE5Kk_0EriYj_PpHlWF6XIhJWt4pFArVbFnDThYgQsEY5A0idR_efKNgWs/s320/Foto.JPG" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 1: Activities of the software architect</td></tr>
</tbody></table><br />
<b>Solving tradeoffs</b><br />
<br />
When I worked as an architect I often found myself in difficult <i>tradeoff situations</i>. That is, I wanted to improve one quality attribute, but to achieve that I needed to downgrade another. Here is a simple but very common example: it's often desirable to have a highly changeable system with the best possible performance. However, these two attributes - performance and changeability - typically correlate negatively; when you increase changeability you often lose efficiency. Doing architecture often means finding the golden mean between competing system qualities - it means choosing the alternative that represents the best compromise. It's about finding the balance between system qualities and the environmental factors of that system (e.g. stakeholders, requirements). The operations manager will focus on the efficiency of a new system, while the development manager will argue that it's important to have a changeable system that generates little maintenance cost. The client wants a new system with the highest possible degree of business process automation. These situations consume a considerable amount of time and energy. <br />
<br />
<b>Sharing knowledge and communication</b><br />
<br />
Another supremely important activity: <i>sharing knowledge</i> in a team of technical experts and other stakeholders. The core problem of software development is to transform the fuzzy knowledge of domain experts into the merciless logical machine code of silly computers that only understand two digits: 0 and 1. This is a long way through the venturesome and endless jungle of human misunderstandings! Therefore, architects communicate a lot. They use models to do that. Models serve as a mapping mechanism between human brains and computers. The set of problems that can arise during the knowledge-to-binary transformation is very diverse. It's impossible for every team member to know all of them. That's another reason why sharing knowledge in a team is so important.<br />
<br />
<b>Nobody is perfect!</b><br />
<br />
Needless to say that <i>nobody is perfect</i>. Every team is different and so is every concrete situation. So in one situation somebody may be the right architect for the team, while in other team set-ups that person doesn't fit. Architects can also have different strengths. I know architects that communicate and socialize very well but don't do so well in designing solutions or organizing the development process. Although they don't master each individual skill, they're all good architects. The common ground is that they were all down-to-earth developers.<br />
<br />
That's all I wanted to express today. <br />
So long, NiklasNiklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com2tag:blogger.com,1999:blog-5701415790759755571.post-88656971168247123972012-07-18T16:59:00.003+02:002012-07-22T17:38:30.179+02:005' on IT-Architecture: root concepts explained by the pioneers of software architectureThe last couple of weeks I have been working on a new software architecture course specifically for the insurance and financial sector. During the preparations I was reading many of the most cited articles on software architecture. The concepts described in these articles are so fundamental (and still up-to-date) that every architect really should know about them. I have enjoyed reading such "old" stuff. I first read most of the cited articles during my studies at university in the mid 90s. It is surprising to realize that, the longer you're in this business, the more you agree with the ideas explained - in articles that were written 40 years ago! I've decided to quote the original text passages - maybe I thought it would be presumptuous to explain them in my own words ;-) I hope you enjoy reading these text passages from the pioneers of software architecture.<br />
<br />
<a name='more'></a><b>On the criteria for system decomposition</b><br />
<br />
"Many readers will now see what criteria were used in each decomposition. In the first decomposition the criterion used was to make each major step in the processing a module. One might say that to get the first decomposition one makes a flowchart. This is the most common approach to decomposition or modularization. It is an outgrowth of all programmer training which teaches us that we should begin with a rough flowchart and move from there to a detailed implementation. The flowchart was a useful abstraction for systems with on the order of 5,000-10,000 instructions, but as we move beyond that it does not appear to be sufficient; something additional is needed.<br />
<br />
The second decomposition was made using "information hiding" as a criterion. The modules no longer correspond to steps in the processing. [...] Every module in the second decomposition is characterized by its knowledge of a design decision which it hides from all others. Its interface or definition was chosen to reveal as little as possible about its inner workings."<br />
<br />
in: On the Criteria To Be Used in Decomposing Systems into Modules, D.L. Parnas, 1972<br />
<br />
<b>On the information hiding design principle</b><br />
<br />
"Our module structure is based on the decomposition criterion known as information hiding [IH]. According to this principle, system details that are likely to change independently should be the secrets of separate modules; the only assumptions that should appear in the interfaces between modules are those that are considered unlikely to change. Each data structure is used in only one module; it may be directly accessed by one or more programs within the module but not by programs outside the module. Any other program that requires information stored in a module’s data structures must obtain it by calling access programs belonging to that module.<br />
<br />
Applying this principle is not always easy. It is an attempt to minimize the expected cost of software and requires that the designer estimate the likelihood of changes. Such estimates are based on past experience, and may require knowledge of the application area, as well as an understanding of hardware and software technology."<br />
<br />
in: The Modular Structure of Complex Systems, D.L. Parnas, 1985<br />
<br />
<b>On module hierarchies</b><br />
<br />
"In discussions of system structure it is easy to confuse the benefits of a good decomposition with those of a hierarchical structure. We have a hierarchical structure if a certain relation may be defined between the modules or programs and that relation is a partial ordering. The relation we are concerned with is "uses" or "depends upon". [...] The partial ordering gives us two additional benefits. First, parts of the system are benefited (simplified) because they use the services of lower levels. Second, we are able to cut off the upper levels and still have a usable and useful product. [...] The existence of the hierarchical structure assures us that we can "prune" off the upper levels of the tree and start a new tree on the old trunk. If we had designed a system in which the "low level" modules made some use of the "high level" modules, we would not have the hierarchy, we would find it much harder to remove portions of the system, and "level" would not have much meaning in the system."<br />
<br />
in: On the Criteria To Be Used in Decomposing Systems into Modules, D.L. Parnas, 1972<br />
<br />
<b>On the separation of concerns</b><br />
<br />
"Let me try to explain to you, what to my taste is characteristic for all intelligent thinking. It is, that one is willing to study in depth an aspect of one's subject matter in isolation for the sake of its own consistency, all the time knowing that one is occupying oneself only with one of the aspects. We know that a program must be correct and we can study it from that viewpoint only; we also know that it should be efficient and we can study its efficiency on another day, so to speak. In another mood we may ask ourselves whether, and if so: why, the program is desirable. But nothing is gained —on the contrary!— by tackling these various aspects simultaneously. It is what I sometimes have called "the separation of concerns", which, even if not perfectly possible, is yet the only available technique for effective ordering of one's thoughts, that I know of. This is what I mean by "focussing one's attention upon some aspect": it does not mean ignoring the other aspects, it is just doing justice to the fact that from this aspect's point of view, the other is irrelevant. It is being one- and multiple-track minded simultaneously."<br />
<br />
in: On the role of scientific thought, Edsger W. Dijkstra, 1974<br />
<br />
<b>On conceptual integrity</b><br />
<br />
"Such design coherence in a tool not only delights, it also yields ease of learning and ease of use. The tool does what one expects it to do. I argued [...] that conceptual integrity is the most important consideration in system design. Sometimes the virtue is called coherence, sometimes consistency, sometimes uniformity of style [...] The solo designer or artist usually produces works with this integrity subconsciously; he tends to make each microdecision the same way each time he encounters it (barring strong reasons). If he fails to produce such integrity, we consider the work flawed, not great."<br />
<br />
in: The Design of Design, Frederick P. Brooks, 2010 (originally introduced in: The Mythical Man Month, 1975)Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com0tag:blogger.com,1999:blog-5701415790759755571.post-64145341017803302772012-06-25T15:39:00.003+02:002012-07-19T08:19:00.390+02:005' on IT-Architecture: four laws of robust software systems<a href="http://www.murphys-laws.com/murphy/murphy-true.html">Murphy's Law</a> ("If anything can go wrong, it will") was born at Edwards Air Force Base in 1949 at North Base. It was named after Capt. Edward A. Murphy, an engineer working on Air Force Project MX981, (a project) designed to see how much sudden deceleration a person can stand in a crash. One day, after finding that a transducer was wired wrong, he cursed the technician responsible and said, "If there is any way to do it wrong, he'll find it." <br />
<br />
<a name='more'></a>For the reason described above it may be good to put a quality assurance process in place. I could also call this blog "the four laws of steady software quality". It's about some fundamental techniques that can help to achieve superior quality over the long run. This is particularly important if you're developing a central component that will cause serious damage if it fails in production. OK, here is my (never final and not holistic) list of practical quality assurance tips.<br />
<br />
Law 1: facilitate change<br />
<br />
There is nothing permanent except change. If a system isn't designed in accordance with this supremely important reality, the probability of failure may increase above average. A widely used technique to facilitate change is the development of a sufficient set of <a href="http://en.m.wikipedia.org/wiki/Unit_testing">unit tests</a>. Unit testing enables you to uncover regressions in existing functionality after changes have been made to a system. It also encourages you to really think about the desired functionality and required design of the component under development.<br />
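As a trivial illustration of how a unit test pins down existing behaviour so that regressions surface after a change, here is a hypothetical, framework-free example (the <code>Pricing</code> class and its discount rule are invented for illustration only):

```java
// Hypothetical production code: a small business rule we want to protect
// against regressions (10% discount above 100 EUR order value).
class Pricing {
    static double discountedTotal(double total) {
        return total > 100.0 ? total * 0.9 : total;
    }
}

// A minimal hand-rolled regression test; in practice you would use a test
// framework such as JUnit, but the principle is the same.
public class PricingTest {
    public static void main(String[] args) {
        // These assertions document the expected behaviour; if a later change
        // breaks the rule, the test fails and uncovers the regression.
        check(Pricing.discountedTotal(50.0) == 50.0);   // below the limit: no discount
        check(Pricing.discountedTotal(200.0) == 180.0); // above the limit: 10% off
        System.out.println("all pricing tests passed");
    }

    static void check(boolean condition) {
        if (!condition) throw new AssertionError("regression detected");
    }
}
```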
<br />
Law 2: don't rush through the functional testing phase<br />
<br />
In economics, the marginal utility of a good is the gain (or loss) from an increase (or decrease) in the consumption of that good. The law of diminishing marginal utility says that the marginal utility of each (homogeneous) unit decreases as the supply of units increases (and vice versa). The first <a href="http://en.wikipedia.org/wiki/Functional_testing">functional test</a> cases often walk through the main scenarios covering the main paths of the considered software. None of the code tested has been executed before, so these test cases have a very high marginal utility. Subsequent test cases may walk through the same code ranges except for specific side paths at specific validation conditions, for instance. These test cases may cover three or four additional lines of code in your application. As a result, they have a smaller marginal utility than the first test cases. <br />
<br />
My law about functional testing suggests: as long as the execution of the next test case yields a significant utility, the following applies: the more time you invest into testing, the better the outcome! So don't rush through a functional testing phase and miss useful test cases (this assumes the special case in which usefulness can be quantified). Try to find the useful test cases that promise a significant gain in perceptible quality. On the other hand, if you're executing test cases with a negative marginal utility, you're actually investing more effort than you gain in terms of perceptible quality. There is a special (but not uncommon) situation where the client does not run functional tests on a systematic basis. This law then suggests: the longer the application is in the test environment, the better the outcome. <br />
<br />
Law 3: run (non-functional) benchmark tests<br />
<br />
Another piece of good, lasting software quality is a regular <a href="http://en.wikipedia.org/wiki/Load_testing">load test</a>. To make results usable, load tests need a defined, steady environment and a baseline of measured values (a benchmark). These values are at least: CPU, response time and memory footprint. Load tests of new releases can be compared to load tests of older releases. That way we can also bypass the often-stated requirement that the load test environment needs to have the same capacity parameters as the production environment. In many cases it is possible to see the real big issues with a relatively small set of parallel users (e.g. 50 users). <br />
<br />
It makes limited sense to do load testing if single-user <a href="http://en.wikipedia.org/wiki/Profiling_%28computer_programming%29">profiling</a> results are bad. Therefore it's a good idea to perform repeatable profiling test cases with every release. This way profiling results can be compared to each other (again: the benchmark idea). We do CPU and elapsed-time profiling as well as memory profiling. Profiling is an activity that runs in parallel to actual development. It makes sense to focus on the main scenarios used regularly in production. <br />
<br />
Law 4: avoid dependency lock-in<br />
<br />
The difference between trouble and severe crisis is the time it takes to fix the problem that causes the trouble. For this reason you always need a way back to your previous release - a fallback scenario to avoid a production crisis with severe business impact. You enable rollback by avoiding dependency lock-in. Runtime dependencies of your application on neighbouring systems may arise through joint interface or contract changes during development. If you implemented requirements that resulted in changed interfaces and contracts, then you cannot simply roll back; that's obvious. Therefore you need to avoid too many interface and contract changes. Small release cycles help to reduce dependencies between application versions in one release because fewer changes are rolled to production. Another counteraction against dependency lock-in is to keep neighbouring systems downwards compatible for one version. <br />
<br />
That's it in terms of robust systems.<br />
Cheers, NiklasNiklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com2tag:blogger.com,1999:blog-5701415790759755571.post-2317585183822012692012-05-15T16:38:00.000+02:002012-05-15T16:38:34.806+02:005' on IT-Architecture: three laws of good software architectureThe issue with architectural decisions is that they affect the whole system and/or you often need to make them early in the development process. It means a lot of effort if you change such a decision a couple of months later. From an economic standpoint, architectural decisions are often irrevocable. Good architecture is one that allows an architect to make late decisions without a major effect on effort and cost. Let's put that on record.<a name='more'></a><br />
<br />
Law 1: Good architecture is one that enables architects to have a minimum of irrevocable decisions.<br />
<br />
To minimize the set of irrevocable decisions the system needs to be responsive to change. There is a major lesson I have learned about software development projects: nothing is permanent except change. The client changes his opinion about requirements. The stakeholders change their viewpoint of what's important. People join and leave the project team. The fact that change alone is unchanging leads me to the second law of good architecture, that is:<br />
<br />
Law 2: To make decisions revocable you need to design for flexibility.<br />
<br />
This is the most provocative statement, and I have controversial discussions about it. The reason is that flexibility introduces the need for abstraction. Abstraction uses a strategy of simplification, wherein formerly concrete details are left ambiguous, vague, or undefined (from <a href="http://en.wikipedia.org/wiki/Abstraction">Wikipedia</a>). This simplification process isn't always simple to do, and to follow for others in particular. "Making something easy to change makes the overall system a little more complex, and making everything easy to change makes the entire system very complex. Complexity is what makes software hard to change." (<a href="http://martinfowler.com/ieeeSoftware/whoNeedsArchitect.pdf">from M. Fowler</a>) This is one core problem of building good software architecture: developing software that is easy to change but at the same time understandable. There are several concepts that try to tackle this paradoxical problem: <a href="http://en.wikipedia.org/wiki/Design_Patterns">design patterns</a> and <a href="http://www.objectmentor.com/resources/articles/Principles_and_Patterns.pdf">object oriented design principles</a>. Polymorphism, loose coupling and high cohesion are flexibility enablers to me.<br />
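As a toy illustration of polymorphism and loose coupling as flexibility enablers (my own hypothetical example, not from the cited articles): a caller bound to an interface can have its implementation swapped without being changed itself, which keeps the original decision revocable.

```java
// Hypothetical sketch: the tax calculation decision stays revocable because
// Invoice depends only on the abstraction, not on a concrete implementation.
interface TaxPolicy {
    double taxFor(double amount);
}

class FlatTax implements TaxPolicy {
    public double taxFor(double amount) { return amount * 0.19; }
}

class NoTax implements TaxPolicy {
    public double taxFor(double amount) { return 0.0; }
}

class Invoice {
    private final TaxPolicy taxPolicy; // loose coupling: only the interface is known

    Invoice(TaxPolicy taxPolicy) {
        this.taxPolicy = taxPolicy;
    }

    // Revoking the tax decision later means passing a different TaxPolicy;
    // this method never changes.
    double grossTotal(double net) {
        return net + taxPolicy.taxFor(net);
    }
}
```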
<br />
Law 3: To make use of flexibility one needs to refactor mercilessly.<br />
<br />
Flexibility is not an end in itself. You need to actively make use of flexible design. If something changes and makes a previous design or architectural decision obsolete, you need to go into the code and change the software. Otherwise the effort of building flexible software is useless, and technical debt may cause late delays and a maintenance nightmare. Taking rigorous action on your code base requires continuous feedback about the qualities of your software. To be able to refactor, it is therefore essential that the code base is covered by a sufficient amount of automated tests. In an ideal scenario everything is integrated into a continuous integration environment to receive permanent feedback about the health of your code base.Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com10tag:blogger.com,1999:blog-5701415790759755571.post-21246930916110430102012-05-04T15:09:00.004+02:002012-05-08T13:12:13.216+02:00Java 7: NIO.2 I/O operations on asynchronous channels are not atomicThis part of my NIO.2 series wasn't on schedule when I started writing about NIO.2 asynchronous file channels. It deals with an important detail: read and write operations are not atomic. What that means is that <code>AsynchronousFileChannel -> write()</code> does not guarantee that all bytes passed as parameter are written to the destination file. Instead, it returns the number of bytes written as the result of the corresponding I/O operation, and the client needs to deal with situations where the number of bytes written isn't equal to the number of bytes remaining in the passed <code>ByteBuffer</code>. <br />
<br />
<a name='more'></a>Let's recall the method signatures of the <code>read()</code> and <code>write()</code> operations in the <code>AsynchronousFileChannel</code> interface for a moment.<br />
<br />
<pre class="java" name="code">public abstract <A> void write(ByteBuffer src,
                               long position,
                               A attachment,
                               CompletionHandler<Integer,? super A> handler);

public abstract <A> void read(ByteBuffer dst,
                              long position,
                              A attachment,
                              CompletionHandler<Integer,? super A> handler);
</pre><br />
As you can see, these signatures offer to pass a completion handler. I've already introduced the completion handler in my last blog about closing file channels safely. You can also use the completion handler to enforce that all bytes are written or read when you perform I/O operations on an asynchronous channel. Here is the code snippet that does the job.<br />
<br />
<script src="https://gist.github.com/2506857.js">
</script><br />
The <code>readAll</code> (line 14) and <code>writeFully</code> (line 35) methods both call the corresponding <code>read</code> or <code>write</code> operations on asynchronous file channels recursively. This recursion ends when the bytes of the source <code>ByteBuffer</code> have been transferred completely. Notice that the main thread has to wait for these recursions to finish. Therefore a <code>CountDownLatch</code> stops the main thread until all bytes are processed by the I/O thread that executes the <code>CompletionHandler</code>.<br />
<br />
The explained procedure works because the position of the source or destination <code>ByteBuffer</code> is always in sync with the actual bytes transferred. Another important fact is that the <code>write()</code> and <code>read()</code> operations in the <code>CompletionHandler</code> are chained: when one write task completes, a new one is issued, and when that one completes, another is issued, and so forth. Although different JVM threads will participate, there won't be an issue in sharing the same (non-thread-safe) <code>ByteBuffer</code> instance.<br />
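The chaining idea can be sketched in a small, self-contained example (class, method, and file names are mine, not the gist's): the completion handler re-issues <code>write()</code> until the buffer is drained, and a <code>CountDownLatch</code> makes the main thread wait for the I/O threads.

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CountDownLatch;

public class WriteFullyExample {

    // Re-issues write() from the completion handler until the buffer is drained,
    // then releases the latch. The ByteBuffer position stays in sync with the
    // bytes actually transferred, so no extra bookkeeping is needed.
    static void writeFully(final AsynchronousFileChannel channel, final ByteBuffer src,
                           long position, final CountDownLatch done) {
        channel.write(src, position, position, new CompletionHandler<Integer, Long>() {
            @Override
            public void completed(Integer written, Long pos) {
                if (src.hasRemaining()) {
                    writeFully(channel, src, pos + written, done); // chain the next write
                } else {
                    done.countDown();
                }
            }
            @Override
            public void failed(Throwable exc, Long pos) {
                exc.printStackTrace();
                done.countDown();
            }
        });
    }

    public static long demo() throws Exception {
        Path file = Files.createTempFile("writefully", ".tmp");
        CountDownLatch done = new CountDownLatch(1);
        AsynchronousFileChannel channel =
                AsynchronousFileChannel.open(file, StandardOpenOption.WRITE);
        try {
            writeFully(channel, ByteBuffer.wrap(new byte[100000]), 0L, done);
            done.await(); // block the main thread until the I/O threads are finished
        } finally {
            channel.close();
        }
        return Files.size(file);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // expect 100000
    }
}
```

Note how the position of the next write is derived from the attachment plus the bytes reported by the previous completion, which is exactly the chaining described above.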
<br />
The NIO.2 file channels series:<br />
- <a href="http://niklasschlimm.blogspot.de/2012/04/java-7-asynchronous-file-channels-part.html">Introduction</a><br />
- <a href="http://niklasschlimm.blogspot.de/2012/04/java-7-asynchronous-file-channels-part_05.html">Applying custom thread pools</a><br />
- <a href="http://niklasschlimm.blogspot.de/2012/05/java-7-9-nio2-file-channels-on-test.html">Closing file channels without losing data</a><br />
- <a href="http://niklasschlimm.blogspot.de/2012/05/java-7-10-nio2-file-channels-on-test.html">I/O operations are not atomic<br />
</a><br />
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js">
</script><br />
<script src="http://gist.github.com/raw/454771/gist-line-number-hack.js">
</script><br />
<script type="text/javascript">
addLineNumbersToAllGists()
</script>Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com1tag:blogger.com,1999:blog-5701415790759755571.post-85506608667799373712012-05-04T15:08:00.002+02:002012-05-08T13:11:54.615+02:00Java 7: Closing NIO.2 file channels without losing dataClosing an asynchronous file channel can be very difficult. If you submitted I/O tasks to the asynchronous channel, you want to be sure that those tasks are executed properly. This can actually be a tricky requirement on asynchronous channels, for several reasons. The default channel group uses daemon threads as worker threads, which isn't a good choice, because these threads are simply abandoned if the JVM exits. If you use a custom thread pool executor with non-daemon threads (see <a href="http://niklasschlimm.blogspot.de/2012/04/java-7-asynchronous-file-channels-part_05.html">last part of this series</a>), you need to manage the lifecycle of your thread pool yourself. If you don't, the threads stay alive when the main thread exits. Hence, the JVM does not exit at all; all you can do is kill it. <br />
<br />
<a name='more'></a>Another issue when closing asynchronous channels is mentioned in the javadoc of <code>AsynchronousFileChannel</code>: "Shutting down the executor service while the channel is open results in unspecified behavior." This is because the <code>close()</code> operation on <code>AsynchronousFileChannel</code> issues tasks to the associated executor service that simulate the failure of pending I/O operations (in that same thread pool) with an <code>AsynchronousCloseException</code>. Hence, you'll get a <code>RejectedExecutionException</code> if you call <code>close()</code> on an asynchronous file channel instance after you have closed the associated executor service. <br />
<br />
That all being said, the proposed way to safely configure the file channel and shutdown that channel goes like this:<br />
<br />
<script src="https://gist.github.com/2137930.js">
</script><br />
The custom thread pool executor service is defined in lines 6 and 7. The file channel is defined in lines 10 to 13. In lines 18 to 20 the asynchronous channel is closed in an orderly manner: first the channel itself is closed, then the executor service is shut down, and last but not least the thread awaits termination of the thread pool executor. <br />
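The same shutdown order can be shown in a compact, self-contained sketch (pool size, file name, and timeout are illustrative, not taken from the gist): close the channel first, shut down the executor second, then await termination.

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.EnumSet;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SafeCloseExample {

    public static boolean demo() throws Exception {
        Path file = Files.createTempFile("safeclose", ".tmp");
        // Custom pool with non-daemon threads - its lifecycle is our responsibility.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        AsynchronousFileChannel channel = AsynchronousFileChannel.open(
                file, EnumSet.of(StandardOpenOption.WRITE), pool);
        channel.write(ByteBuffer.wrap("some data".getBytes()), 0).get();
        // Orderly shutdown: channel first, executor second, then await termination.
        // Shutting down the executor while the channel is still open would provoke
        // RejectedExecutionExceptions from close().
        channel.close();
        pool.shutdown();
        return pool.awaitTermination(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // true when the pool terminated cleanly
    }
}
```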
<br />
Although this is a safe way to close a channel with a custom executor service, there's a new issue introduced. Clients submit asynchronous write tasks (line 16) and may want to be sure that, once they've been submitted successfully, those tasks will definitely be executed. Always waiting for <code>Future.get()</code> to return (line 23) isn't an option, because in many cases this would reduce *asynchronous* file channels to absurdity. The snippet above will return lots of "Task wasn't executed!" messages because the channel is closed immediately after the write operations are submitted to the channel (line 18). To avoid such 'data loss' you can implement your own <code>CompletionHandler</code> and pass it to the requested write operation.<br />
<br />
<script src="https://gist.github.com/2146334.js">
</script><br />
The <code>CompletionHandler.failed()</code> method (line 16) catches any runtime exception during task processing. You can implement compensation code here to avoid data loss. When you work on mission-critical data, it may be a good idea to use <code>CompletionHandler</code>s. But *still* there's another issue: clients can submit tasks, but they don't know whether the pool will successfully process those tasks. Successful in this context means that the bytes submitted actually reach their destination (the file on the hard disk). If you want to be sure that all submitted tasks are actually processed before closing, it gets a little trickier. You need a 'graceful' closing mechanism that waits until the work queue is empty *before* it actually closes the channel and the associated executor service (this isn't possible using standard lifecycle methods). <br />
<b><br />
Introducing GracefulAsynchronousFileChannel</b><br />
<br />
My last snippets introduce the <code>GracefulAsynchronousFileChannel</code>. You can get the complete code <a href="https://github.com/nschlimm/playground/blob/master/java7-playground/src/main/java/com/schlimm/java7/nio/investigation/closing/graceful/GracefulAsynchronousFileChannel.java">here in my Git repository</a>. The behaviour of that channel is this: it guarantees to process all successfully submitted write operations and throws a <code>NonWritableChannelException</code> once the channel prepares for shutdown. It takes two things to implement that behaviour. Firstly, you need to override <code>afterExecute()</code> in an extension of <code>ThreadPoolExecutor</code> that sends a signal when the queue is empty. This is what <code>DefensiveThreadPoolExecutor</code> does. <br />
<br />
<script src="https://gist.github.com/2301734.js">
</script><br />
The <code>afterExecute()</code> method (line 12) is executed after each processed task, by the thread that processed that given task. The implementation sends the <code>isEmpty</code> signal in line 18. The second thing you need to gracefully close a channel is a custom implementation of the <code>close()</code> method of <code>AsynchronousFileChannel</code>.<br />
<br />
<script src="https://gist.github.com/2301874.js">
</script><br />
Study that code for a while. The interesting bits are in line 11, where the <code>innerChannel</code> gets replaced by a read-only channel. That causes any subsequent asynchronous write requests to fail with a <code>NonWritableChannelException</code>. In line 16 the <code>close()</code> method waits for the <code>isEmpty</code> signal to happen. When this signal is sent after the last write task, the <code>close()</code> method continues with an orderly shutdown procedure (line 27 ff.). Basically, the code adds a shared lifecycle state across the file channel and the associated thread pool. That way both objects can communicate during the shutdown procedure and avoid data loss.<br />
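The shared-lifecycle idea can be sketched without the channel part (class and method names here are mine, loosely modeled on <code>DefensiveThreadPoolExecutor</code>; the numbers are illustrative): <code>afterExecute()</code> signals a condition when the work queue drains, and a graceful close method waits on that condition before shutting down.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class DrainSignallingPool extends ThreadPoolExecutor {

    private final ReentrantLock lock = new ReentrantLock();
    private final Condition isEmpty = lock.newCondition();

    public DrainSignallingPool(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS,
              new LinkedBlockingQueue<Runnable>());
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        lock.lock();
        try {
            if (getQueue().isEmpty()) {
                isEmpty.signalAll(); // tell closeGracefully() the queue has drained
            }
        } finally {
            lock.unlock();
        }
    }

    // Blocks until the work queue is empty, then performs an orderly shutdown.
    public void closeGracefully() throws InterruptedException {
        lock.lock();
        try {
            while (!getQueue().isEmpty()) {
                isEmpty.await();
            }
        } finally {
            lock.unlock();
        }
        shutdown();
        awaitTermination(5, TimeUnit.SECONDS);
    }

    public static int demo() throws InterruptedException {
        DrainSignallingPool pool = new DrainSignallingPool(2);
        final AtomicInteger processed = new AtomicInteger();
        for (int i = 0; i < 100; i++) {
            pool.execute(new Runnable() {
                public void run() { processed.incrementAndGet(); }
            });
        }
        pool.closeGracefully();
        return processed.get(); // all 100 submitted tasks were processed
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

The check-then-await sequence holds the same lock under which <code>afterExecute()</code> signals, so the signal cannot be missed; the trailing <code>shutdown()</code>/<code>awaitTermination()</code> covers tasks that were already taken off the queue but are still running.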
<br />
Here is a logging client that uses the <code>GracefulAsynchronousFileChannel</code>.<br />
<br />
<script src="https://gist.github.com/2308688.js">
</script><br />
The client starts two threads, one thread issues write operations in an infinite loop (line 6 ff.). The other thread closes the file channel asynchronously after one second of processing (line 25 ff.). If you run that client, then the following output is produced:<br />
<pre class="java" name="code">Starting graceful shutdown ...
Deal with the fact that the channel was closed asynchronously ... java.nio.channels.NonWritableChannelException
Channel blocked for write access ...
Waiting for signal that queue is empty ...
Issueing signal that queue is empty ...
Received signal that queue is empty ... closing
File closed ...
Pool closed ...
Expected file size (bytes): 400020
Actual file size (bytes): 400020
No write operation was lost!
</pre>The output shows the orderly shutdown procedure of the participating threads. The logging thread needs to deal with the fact that the channel was closed asynchronously. After the queued tasks are processed, the channel resources are closed. No data was lost; everything that the client issued was really written to the file destination. No <code>AsynchronousCloseException</code>s or <code>RejectedExecutionException</code>s occur in such a graceful closing procedure.<br />
<br />
That's all in terms of safely closing asynchronous file channels. The complete code <a href="https://github.com/nschlimm/playground/tree/master/java7-playground/src/main/java/com/schlimm/java7/nio/investigation/closing/graceful">is here in my Git repository</a>. I hope you've enjoyed it a little. Looking forward to your comments.<br />
Cheers, Niklas<br />
<br />
The NIO.2 file channels series:<br />
- <a href="http://niklasschlimm.blogspot.de/2012/04/java-7-asynchronous-file-channels-part.html">Introduction</a><br />
- <a href="http://niklasschlimm.blogspot.de/2012/04/java-7-asynchronous-file-channels-part_05.html">Applying custom thread pools</a><br />
- <a href="http://niklasschlimm.blogspot.de/2012/05/java-7-9-nio2-file-channels-on-test.html">Closing file channels without losing data</a><br />
- <a href="http://niklasschlimm.blogspot.de/2012/05/java-7-10-nio2-file-channels-on-test.html">I/O operations are not atomic<br />
</a><br />
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js">
</script><br />
<script src="http://gist.github.com/raw/454771/gist-line-number-hack.js">
</script><br />
<script type="text/javascript">
addLineNumbersToAllGists()
</script>Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com0tag:blogger.com,1999:blog-5701415790759755571.post-91861534242896614872012-04-19T14:22:00.001+02:002012-04-20T17:31:33.240+02:00Threading stories: ThreadLocal in web applicationsThis week I spent considerable time eliminating all the <code>ThreadLocal</code> variables in our web applications. The reason was that they created classloader leaks and we couldn't undeploy our applications properly anymore. Classloader leaks happen when a GC root keeps referencing an application object after the application was undeployed. If an application object is still referenced after undeploy, then the whole classloader can't be garbage collected, because the object in question references your application's class file, which in turn references the classloader. This will cause an <code>OutOfMemoryError</code> after you've undeployed and redeployed a couple of times.<br />
<br />
<a name='more'></a><code>ThreadLocal</code> is one classic candidate that can easily create classloader leaks in web applications. The server manages its threads in a pool. These threads live longer than your web application. In fact, they don't die at all until the underlying JVM dies. Now, if you put a <code>ThreadLocal</code> that references an object of your class into a pooled thread, you *must* be careful. You need to make sure that this variable is removed again using <code>ThreadLocal.remove()</code>. The issue in web applications is: where is the right place to safely remove <code>ThreadLocal</code> variables? Also, you may not want to modify that "removing code" every time a colleague decides to add another <code>ThreadLocal</code> to the managed threads. <br />
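The effect is easy to reproduce outside a servlet container; here is a minimal sketch (names are mine) where a single-threaded executor stands in for the server's worker pool, and a value set by one "request" leaks into the next:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class StaleThreadLocalDemo {

    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<String>();

    public static String demo() throws Exception {
        // A single reused worker thread stands in for the server's thread pool.
        ExecutorService pooled = Executors.newSingleThreadExecutor();
        // "Request" 1 sets a thread local and forgets to call remove().
        pooled.submit(new Runnable() {
            public void run() { CURRENT_USER.set("alice"); }
        }).get();
        // "Request" 2 runs on the same pooled thread and sees the stale value.
        String leaked = pooled.submit(new Callable<String>() {
            public String call() { return CURRENT_USER.get(); }
        }).get();
        pooled.shutdown();
        return leaked;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // "alice" leaked into the next task
    }
}
```

In a real container the leaked reference additionally pins the web application's classloader, which is what prevents clean undeployment.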
<br />
We've developed a wrapper class around <code>ThreadLocal</code> that keeps all the thread local variables in one single <code>ThreadLocal</code> variable. Here is the code.<br />
<br />
<script src="https://gist.github.com/2234464.js">
</script><br />
The advantage of the utility class is that no developer needs to manage the thread local variable lifecycle individually. The class puts all the thread locals into one map of variables. The <code>destroy()</code> method can be invoked wherever you can safely remove all thread locals in your web application. In our case that's a <code>ServletRequestListener -> requestDestroyed()</code> method. You will also need to place finally blocks elsewhere; typical places are near the <code>HttpServlet</code>, in the <code>init()</code>, <code>doPost()</code> and <code>doGet()</code> methods. This removes all thread locals in the pooled worker threads after the request is done or an exception is thrown unexpectedly. Sometimes it happens that the <code>main</code> thread of the server leaks thread local variables. If that is the case, you need to find the right places to call the <code>ThreadLocalUtil -> destroy()</code> method. To do that, figure out where the main thread actually *creates* the thread local variables. You could use your debugger to do that.<br />
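A minimal sketch of the wrapper idea (the gist has more to it; method names and the string-keyed map are illustrative): one static <code>ThreadLocal</code> holding a map, and a single <code>destroy()</code> that releases everything the current thread has accumulated.

```java
import java.util.HashMap;
import java.util.Map;

public final class ThreadLocalUtil {

    // All per-thread variables live in this single thread local map.
    private static final ThreadLocal<Map<String, Object>> VARIABLES =
            new ThreadLocal<Map<String, Object>>() {
                @Override
                protected Map<String, Object> initialValue() {
                    return new HashMap<String, Object>();
                }
            };

    public static void put(String key, Object value) {
        VARIABLES.get().put(key, value);
    }

    public static Object get(String key) {
        return VARIABLES.get().get(key);
    }

    // Invoke once per request, e.g. in ServletRequestListener.requestDestroyed(),
    // to release everything the current worker thread has accumulated.
    public static void destroy() {
        VARIABLES.remove();
    }

    private ThreadLocalUtil() {
    }
}
```

Because every piece of per-thread state goes through this class, a single <code>destroy()</code> call in one well-chosen place cleans up the pooled thread, no matter how many "thread locals" colleagues add later.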
<br />
Many guys out there suggest omitting <code>ThreadLocal</code> in web applications, for several reasons. It can be very difficult to remove them in a pooled-thread environment so that you can undeploy the applications safely. <code>ThreadLocal</code> variables can be useful, but it's fair to consider other techniques before applying them. An alternative for web applications to carry request-scoped parameters is the <code>HttpServletRequest</code>. Many web frameworks allow for generic request parameter access as well as request/session attribute access, without ties to the native Servlet/Portlet API. Also, many frameworks support request-scoped beans that can be injected into an object tree using dependency injection. All these options fulfill most requirements and should be considered prior to using <code>ThreadLocal</code>.<br />
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js">
</script><br />
<script src="http://gist.github.com/raw/454771/gist-line-number-hack.js">
</script><br />
<script type="text/javascript">
addLineNumbersToAllGists()
</script>Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com4tag:blogger.com,1999:blog-5701415790759755571.post-79974948194327891692012-04-05T09:42:00.009+02:002012-05-08T13:11:34.104+02:00Java 7: NIO.2 File Channels on the test bench - Part 2 - Applying custom thread poolsAsynchronous file processing isn't a guarantee of high performance. In my <a href="http://niklasschlimm.blogspot.de/2012/04/java-7-asynchronous-file-channels-part.html">last post</a> I demonstrated that conventional I/O can be faster than asynchronous channels. There are some additional important facts to know when applying NIO.2 file channels. The <code>Iocp</code> class that performs all the asynchronous I/O tasks in NIO.2 file channels is, by default, backed by a so-called "cached" thread pool. That's a thread pool that creates new threads as needed, but will reuse previously constructed threads *when* they are available. Look at the code of the <code>ThreadPool</code> class held by the <code>Iocp</code>.<br />
<a name='more'></a><br />
<script src="https://gist.github.com/1950482.js">
</script><br />
The thread pool in the default channel group is constructed as a <code>ThreadPoolExecutor</code> with a maximum thread count of Integer.MAX_VALUE and a keep-alive time of Long.MAX_VALUE. The threads are created as daemon threads by the thread factory. A synchronous hand-over queue is used to trigger thread creation when all threads are busy. There are several issues with this configuration: <br />
<br />
1. If you perform write operations on asynchronous channels in a burst, you will create thousands of worker threads, which likely results in an <code>OutOfMemoryError: unable to create new native thread</code>. <br />
2. When the JVM exits, all daemon threads are abandoned - finally blocks are not executed, stacks are not unwound.<br />
<br />
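The described default configuration corresponds roughly to the following sketch (built from the description above, not copied from the JDK source): a zero-core pool with an unbounded maximum, daemon threads, and a synchronous hand-over queue, so every submission that finds all workers busy creates a fresh thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class DefaultPoolSketch {

    public static int demo() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, Integer.MAX_VALUE,                   // no core threads, unbounded maximum
                Long.MAX_VALUE, TimeUnit.MILLISECONDS,  // effectively infinite keep-alive
                new SynchronousQueue<Runnable>(),       // hand-over queue, no capacity
                new ThreadFactory() {                   // daemon worker threads
                    public Thread newThread(Runnable r) {
                        Thread t = new Thread(r);
                        t.setDaemon(true);
                        return t;
                    }
                });
        final CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < 3; i++) {
            pool.execute(new Runnable() {
                public void run() {
                    try { release.await(); } catch (InterruptedException ignored) { }
                }
            });
        }
        int threadsCreated = pool.getPoolSize(); // one fresh thread per busy submission
        release.countDown();
        pool.shutdown();
        return threadsCreated;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // a burst of N blocked tasks means N threads
    }
}
```

Scale the burst from 3 tasks to thousands and you get exactly the <code>unable to create new native thread</code> failure mode described in point 1.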
In <a href="http://niklasschlimm.blogspot.de/2012/03/threading-stories-about-robust-thread.html">my other blog</a> I have explained why unbounded thread pools can cause trouble. Therefore, if you use asynchronous file channels, it may be an option to use custom thread pools instead of the default thread pool. The following snippet shows an example custom setting.<br />
<br />
<script src="https://gist.github.com/2135620.js">
</script><br />
The javadoc of <code>AsynchronousFileChannel</code> states that the custom executor should "minimally [...] support an unbounded work queue and should not run tasks on the caller thread of the execute method." That's a risky statement; it is only reasonable if resources aren't an issue, which is rarely the case. It may make sense to use bounded thread pools for asynchronous file channels: you cannot get a too-many-threads issue, nor can you flood your heap with work-queue tasks. In the example above you have five threads that execute asynchronous I/O tasks and a work queue with a capacity of 2500 tasks. If the capacity limit is exceeded, the rejected-execution handler applies the <code>CallerRunsPolicy</code>, where the client has to execute the write task synchronously. This can (dramatically) slow down system performance because the workload is "pushed back" to the client and executed synchronously. However, it can also save you from much more severe issues whose outcome is unpredictable. It's good practice to work with bounded thread pools and to keep the thread pool sizes configurable, so that you can adjust them at runtime. Again, to learn more about robust thread pool settings <a href="http://niklasschlimm.blogspot.de/2012/03/threading-stories-about-robust-thread.html">see my other blog entry</a>.<br />
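A hedged sketch of such a bounded setup (the pool sizes mirror the numbers above; class and file names are illustrative, and the shutdown order follows the closing rules discussed elsewhere in this series):

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.EnumSet;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedPoolChannelExample {

    public static int demo() throws Exception {
        // Five workers, a work queue capped at 2500 tasks, and CallerRunsPolicy:
        // overflowing writes are executed synchronously on the submitting thread.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                5, 5, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>(2500),
                new ThreadPoolExecutor.CallerRunsPolicy());
        Path file = Files.createTempFile("bounded", ".tmp");
        AsynchronousFileChannel channel = AsynchronousFileChannel.open(
                file, EnumSet.of(StandardOpenOption.WRITE), pool);
        int written;
        try {
            written = channel.write(ByteBuffer.wrap(new byte[1024]), 0).get();
        } finally {
            channel.close();           // close the channel before the executor
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
        return written;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo() + " bytes written");
    }
}
```

Keeping the pool size and queue capacity in configuration rather than hard-coded, as recommended above, lets you tune the back-pressure behaviour at runtime.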
<br />
<blockquote class="tr_bq" style="background-color: #cfe2f3;">Thread pools with synchronous hand-over queues and unbounded maximum thread pool sizes can aggressively create new threads and thus can seriously harm system stability by consuming (pc registers and java stacks) runtime memory of the JVM. The 'longer' (elapsed time) the asynchronous task, the more likely you'll run into this issue.</blockquote><blockquote class="tr_bq" style="background-color: #cfe2f3;">Thread pools with unbounded work queues and fixed thread pool sizes can aggressively create new tasks and objects and thus can seriously harm system stability by consuming heap memory and CPU through excessive garbage collection activity. The larger (in size) and longer (in elapsed time) the asynchronous task, the more likely you'll run into this issue.</blockquote><script src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js">
</script><br />
That's all in terms of applying custom thread pools to asynchronous file channels. My next blog in this series will explain how to close asynchronous channels safely without losing data.<br />
<br />
The NIO.2 file channels series:<br />
- <a href="http://niklasschlimm.blogspot.de/2012/04/java-7-asynchronous-file-channels-part.html">Introduction</a><br />
- <a href="http://niklasschlimm.blogspot.de/2012/04/java-7-asynchronous-file-channels-part_05.html">Applying custom thread pools</a><br />
- <a href="http://niklasschlimm.blogspot.de/2012/05/java-7-9-nio2-file-channels-on-test.html">Closing file channels without losing data</a><br />
- <a href="http://niklasschlimm.blogspot.de/2012/05/java-7-10-nio2-file-channels-on-test.html">I/O operations are not atomic<br />
</a><br />
<br />
<script src="http://gist.github.com/raw/454771/gist-line-number-hack.js">
</script><br />
<br />
<script type="text/javascript">
addLineNumbersToAllGists()
</script>Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com5tag:blogger.com,1999:blog-5701415790759755571.post-40250984929140553392012-04-05T09:40:00.009+02:002012-05-08T13:11:11.642+02:00Java 7: NIO.2 File Channels on the test bench - Part 1 - IntroductionAnother blog post about new JDK 7 features. This time I am writing about the new <code>AsynchronousFileChannel</code> class. I have been analyzing the new JDK 7 features in depth for a couple of weeks now and I have decided to number my posts consecutively. Just to make sure I don't get confused :-) Here is my 7th post about Java 7 (I admit that - by coincidence - this was also a little confusing). Using NIO.2 asynchronous file channels effectively is a wide topic. There are some things to consider here. I have decided to divide the material into four posts. In this first part I will introduce the concepts involved when you use asynchronous file channels. Since these file channels work asynchronously, it is interesting to look at their performance compared to conventional I/O. The second part deals with issues like memory and CPU consumption and explains how to use the new NIO.2 channels safely in a high-performance scenario. You also need to understand how to close asynchronous channels without losing data; that's part three. Finally, in part four, we'll take a look at concurrency. <br />
<a name='more'></a><br />
<blockquote class="tr_bq"><div style="background-color: #cfe2f3; text-align: left;">Notice: I won't explain the complete API of asynchronous file channels. There are enough posts out there that do a good job on that. My posts dive more into practical applicability and issues you may have when using asynchronous file channels.</div></blockquote><br />
OK, enough vague talking, let's get started. Here is a code snippet that opens an asynchronous channel (line 7), writes a sequence of bytes to the beginning of the file (line 9) and waits for the result to return (line 10). Finally, in line 14 the channel is closed.<br />
<br />
<script src="https://gist.github.com/1950035.js">
</script><br />
<br />
<b>Important participants in asynchonous file channel calls</b><br />
<br />
Before I dive into the code, let's quickly introduce the concepts involved in the asynchronous (file) channel galaxy. The call graph in figure 1 shows the sequence diagram of a call to the <code>open()</code> method of the <code>AsynchronousFileChannel</code> class. A <code>FileSystemProvider</code> encapsulates all the operating system specifics. To amuse everybody, I am using a Windows 7 client as I write this. Therefore a <code>WindowsFileSystemProvider</code> calls the <code>WindowsChannelFactory</code>, which actually creates the file and calls the <code>WindowsAsynchronousFileChannelImpl</code>, which returns an instance of itself. The most important concept is the <code>Iocp</code>, the I/O completion port. It is an API for performing multiple simultaneous asynchronous input/output operations. A completion port object is created and associated with a number of file handles. When I/O services are requested on the object, completion is indicated by a message queued to the I/O completion port. Other processes requesting I/O services are not notified of the completion of the I/O services, but instead check the I/O completion port's message queue to determine the status of their I/O requests. The I/O completion port manages multiple threads and their concurrency. As you can see from the diagram, the <code>Iocp</code> is a subtype of <code>AsynchronousChannelGroup</code>. So in JDK 7 asynchronous channels the asynchronous channel group is implemented as an I/O completion port. It owns the <code>ThreadPool</code> responsible for performing the requested asynchronous I/O operations. The <code>ThreadPool</code> actually encapsulates a <code>ThreadPoolExecutor</code>, which has done all the multi-threaded asynchronous task execution management since Java 1.5. Write operations to asynchronous file channels result in calls to the <code>ThreadPoolExecutor.execute()</code> method. <br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGCc-GVJ0YDZA1754T4VPib8d__D8ldeWVqywCNPPW9uRNvxHveVPzLth45kV6W2NFJdmwL55TXjnYlCAzDKOl7pW-KvJApd4H8RZgpKwYKfYLprdfcwxH7IPLtGcaYaaPkPmilQ0-XeE/s1600/FileChannelCallgraph.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="254" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGCc-GVJ0YDZA1754T4VPib8d__D8ldeWVqywCNPPW9uRNvxHveVPzLth45kV6W2NFJdmwL55TXjnYlCAzDKOl7pW-KvJApd4H8RZgpKwYKfYLprdfcwxH7IPLtGcaYaaPkPmilQ0-XeE/s320/FileChannelCallgraph.JPG" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 1: Callgraph on open call to asynchronous file channel</td></tr>
</tbody></table><br />
<b>Some benchmarks</b><br />
<br />
It's always interesting to look at the performance. Asynchronous non-blocking I/O must be fast, right? To find an answer to that question I ran some benchmarks. Again, I am using <a href="http://www.javaspecialists.eu/archive/Issue124.html">Heinz' tiny benchmarking framework</a> to do that. My machine is an Intel Core i5-2310 CPU @ 2.90 GHz with four cores (64-bit). In a benchmark I need a baseline. My baseline is a simple conventional synchronous write operation into an ordinary file. Here is the snippet:<br />
<br />
<script src="https://gist.github.com/1950373.js">
</script><br />
As you can see in line 25, the benchmark performs a single write operation into an ordinary file. And these are the results:<br />
<br />
<pre class="java" name="code">Test: Performance_Benchmark_ConventionalFileAccessExample_1
Warming up ...
EPSILON:20:TESTTIME:1000:ACTTIME:1014:LOOPS:365947
EPSILON:20:TESTTIME:1000:ACTTIME:1014:LOOPS:372298
Starting test intervall ...
EPSILON:20:TESTTIME:1000:ACTTIME:1000:LOOPS:364706
EPSILON:20:TESTTIME:1000:ACTTIME:1014:LOOPS:368309
EPSILON:20:TESTTIME:1000:ACTTIME:1014:LOOPS:370288
EPSILON:20:TESTTIME:1000:ACTTIME:1001:LOOPS:364908
EPSILON:20:TESTTIME:1000:ACTTIME:1014:LOOPS:370820
Mean: 367.806,2
Std. Deviation: 2.588,665
Total started thread count: 12
Peak thread count: 6
Deamon thread count: 4
Thread count: 5</pre><br />
The following snippet is another benchmark which also issues a write operation (line 25), this time to an asynchronous file channel:<br />
<br />
<script src="https://gist.github.com/1950310.js">
</script><br />
This is the result of the above benchmark on my machine:<br />
<br />
<pre class="java" name="code">Test: Performance_Benchmark_AsynchronousFileChannel_1
Warming up ...
EPSILON:20:TESTTIME:1000:ACTTIME:1015:LOOPS:42667
EPSILON:20:TESTTIME:1000:ACTTIME:1015:LOOPS:193351
Starting test intervall ...
EPSILON:20:TESTTIME:1000:ACTTIME:1015:LOOPS:191268
EPSILON:20:TESTTIME:1000:ACTTIME:1015:LOOPS:186916
EPSILON:20:TESTTIME:1000:ACTTIME:1014:LOOPS:189842
EPSILON:20:TESTTIME:1000:ACTTIME:1014:LOOPS:191103
EPSILON:20:TESTTIME:1000:ACTTIME:1015:LOOPS:192005
Mean: 190.226,8
Std. Deviation: 1.795,733
Total started thread count: 17
Peak thread count: 11
Deamon thread count: 9
Thread count: 10</pre><br />
Since the snippets above do the same thing, it's safe to say that asynchronous file channels aren't necessarily faster than conventional I/O. That's an interesting result, I think. It's difficult to compare conventional I/O and NIO.2 in a single-threaded benchmark. NIO.2 was introduced to provide an I/O technique for highly concurrent scenarios. Therefore, asking what's faster - NIO or conventional I/O - isn't quite the right question. The more appropriate question would be: what is "more concurrent"? However, for now, the results above suggest:<br />
<div style="text-align: center;"><blockquote class="tr_bq" style="background-color: #cfe2f3;">Consider using conventional I/O when only one thread is issuing I/O operations.</blockquote></div>That's enough for now. I have explained the basic concepts and also pointed out that conventional I/O still has its right to exist. In the second post I will introduce some of the issues you may encounter when you use default asynchronous file channels. I will also show how to avoid those issues by applying some more viable settings.<br />
<br />
The NIO.2 file channels series:<br />
- <a href="http://niklasschlimm.blogspot.de/2012/04/java-7-asynchronous-file-channels-part.html">Introduction</a><br />
- <a href="http://niklasschlimm.blogspot.de/2012/04/java-7-asynchronous-file-channels-part_05.html">Applying custom thread pools</a><br />
- <a href="http://niklasschlimm.blogspot.de/2012/05/java-7-9-nio2-file-channels-on-test.html">Closing file channels without losing data</a><br />
- <a href="http://niklasschlimm.blogspot.de/2012/05/java-7-10-nio2-file-channels-on-test.html">I/O operations are not atomic<br />
</a><br />
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js">
</script><br />
<script src="http://gist.github.com/raw/454771/gist-line-number-hack.js">
</script><br />
<script type="text/javascript">
addLineNumbersToAllGists()
</script>Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com9tag:blogger.com,1999:blog-5701415790759755571.post-63510478353352326632012-02-02T15:40:00.000+01:002012-02-02T15:40:37.334+01:00Java 7: A complete invokedynamic exampleAnother blog entry in my current Java 7 series. This time it's dealing with <code>invokedynamic</code>, a new bytecode instruction on the JVM for method invocation. The <code>invokedynamic</code> instruction allows dynamic linkage between a call site and the receiver of the call. That means you can link the class that is performing a method call to the class (and method) that is receiving the call <i>at run-time</i>. All the other JVM bytecode instructions for method invocation, like <code>invokevirtual</code>, hard-wire the target type information into your compilation, i.e. into your class file. Let's look at an example. <a name='more'></a><br />
<br />
<script src="https://gist.github.com/1723147.js"></script><br />
The bytecode snippet above shows an <code>invokevirtual</code> method call of <code>java.lang.String -> length()</code> in line 20. It refers to item 65 in the constant pool table, which is a <code>MethodRef</code> entry (see line 6). Items 42 and 66 in the constant pool table refer to the class and the method descriptor entries. As you can see, the target type and method of the <code>invokevirtual</code> call are completely resolved and hard-wired into the bytecode. Now, let's return to <code>invokedynamic</code>!<br />
<br />
It is important to notice that it is not possible to compile Java code into bytecode that contains an <code>invokedynamic</code> instruction. Java is <a href="http://docs.oracle.com/javase/7/docs/technotes/guides/vm/multiple-language-support.html#typing">statically typed</a>. That means that Java performs type checking at compile time. Therefore, in Java, it is possible (and wanted!) to hard-wire all type information of method call receivers into the callers class file. The caller knows the type name of the call target, as demonstrated in our example above. The use of <code>invokedynamic</code> - on the other hand - enables the JVM to resolve exactly that type information at run-time. This is only required (and wanted!) for dynamic languages, such as JRuby or Rhino. <br />
<br />
Now, suppose you want to implement a new language on the JVM that is dynamically typed. I am not suggesting you should invent *another* language on the JVM, but *suppose* you would, and *suppose* your new language should be dynamically typed. That would mean, in your new language, the linking between a caller and a receiver of a method call is performed at run-time. Since Java 7 this is possible on the bytecode level using the <code>invokedynamic</code> instruction. <br />
<br />
Because I cannot create an <code>invokedynamic</code> instruction using a Java compiler, I will create a class file that contains <code>invokedynamic</code> myself. Once this class file is created I will run that class file's <code>main</code> method using an ordinary <code>java</code> launcher. How can you create a class file without a compiler? This is possible by using bytecode manipulation frameworks like <a href="http://asm.ow2.org/">ASM </a>or <a href="http://www.csg.is.titech.ac.jp/%7Echiba/javassist/">Javassist</a>.<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhKCSN15nrHv8_ak6qsXhsMnnZspiahKz0QvFJg-d0IKbjo3EKVm_Y7fqLDJusE5XlVRd2deWM1oXaBgJizm_lVXAjMkca_I4L-Jd_Kf9R706qN4DAfRg62yNXXQU363n0jkUfNvjQZQEk/s1600/Foto.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhKCSN15nrHv8_ak6qsXhsMnnZspiahKz0QvFJg-d0IKbjo3EKVm_Y7fqLDJusE5XlVRd2deWM1oXaBgJizm_lVXAjMkca_I4L-Jd_Kf9R706qN4DAfRg62yNXXQU363n0jkUfNvjQZQEk/s320/Foto.JPG" width="320" /></a></div><br />
The following code snippet shows the <code>SimpleDynamicInvokerGenerator</code> that can generate a class file <code>SimpleDynamicInvoker.class</code> which contains an invokedynamic instruction.<br />
<br />
<script src="https://gist.github.com/1710583.js">
</script><br />
I am using <a href="http://asm.ow2.org/">ASM</a> here, an all-purpose Java bytecode manipulation and analysis framework, to do the job of creating a correct class file format. In line 30 the call to <code>visitInvokeDynamicInsn</code> creates the <code>invokedynamic</code> instruction. Generating a class that performs an <code>invokedynamic</code> call is only half of the story. You also need some code that links the dynamic call site to the actual target; this is the real purpose of <code>invokedynamic</code>. Here is an example.<br />
<br />
<script src="https://gist.github.com/1710613.js">
</script><br />
The bootstrap method in lines 9-14 selects the actual target of the dynamic call. In our case the target is the <code>sayHello()</code> method. To learn how the bootstrap method is linked to the <code>invokedynamic</code> instruction we need to dive into the bytecode of <code>SimpleDynamicInvoker</code> that we've generated with <code><a href="https://github.com/nschlimm/playground/blob/master/bytecode-playground/src/main/java/com/schlimm/bytecode/invokedynamic/generator/SimpleDynamicInvokerGenerator.java">SimpleDynamicInvokerGenerator</a></code>. <br />
<br />
<script src="https://gist.github.com/1710655.js">
</script><br />
In line 49 you can see the <code>invokedynamic</code> instruction. The logical name of the dynamic method is <code>runCalculation</code>, which is a fictitious name. You can use any name that makes sense; even names like "+" are allowed. The instruction refers to item 20 in the constant pool table (see line 33). This in turn refers to index 0 in the <code>BootstrapMethods</code> attribute (see line 8). There you can see the link to the <code><a href="https://github.com/nschlimm/playground/blob/master/bytecode-playground/src/main/java/com/schlimm/bytecode/invokedynamic/linkageclasses/SimpleDynamicLinkageExample.java">SimpleDynamicLinkageExample.bootstrapDynamic</a></code> method that links the <code>invokedynamic</code> instruction to the call target.<br />
<br />
Now if you call the <code>SimpleDynamicInvoker</code> using the <code>java</code> launcher, then the <code>invokedynamic</code> call is executed.<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHjUhyuUb6MPjEeuCxjdgVJBofG4DZrTG1oGxuj_VXoDMbfdvWN2U31iPB2eeTHdDJ3NTh_e4FaJn2XAXs_B66SwWoRHMNR3dKy1NgN_Qg2_HrIqiwqeyf2Niv2cwk2emXrukdcQRDYhc/s1600/invoke.bmp" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="158" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHjUhyuUb6MPjEeuCxjdgVJBofG4DZrTG1oGxuj_VXoDMbfdvWN2U31iPB2eeTHdDJ3NTh_e4FaJn2XAXs_B66SwWoRHMNR3dKy1NgN_Qg2_HrIqiwqeyf2Niv2cwk2emXrukdcQRDYhc/s320/invoke.bmp" width="320" /></a></div><br />
The following sequence diagram illustrates what's happening when the <code>SimpleDynamicInvoker</code> is called using the <code>java</code> launcher.<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirM-7zOBEWW8dr2FHRarB7VaPfrVaGPPZJBM8-uDn7BLvJOSlgFhEu4vmgMHWI-1Qu9TtEn-dczWsDc2I05_Ph9uuTXAHk3kOUEuS1Mn9H_p1nfUw4hhscE_R6RSagnHD2I26ll2F-TT4/s1600/Unbenannt.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirM-7zOBEWW8dr2FHRarB7VaPfrVaGPPZJBM8-uDn7BLvJOSlgFhEu4vmgMHWI-1Qu9TtEn-dczWsDc2I05_Ph9uuTXAHk3kOUEuS1Mn9H_p1nfUw4hhscE_R6RSagnHD2I26ll2F-TT4/s320/Unbenannt.JPG" width="320" /></a></div><br />
The first call of <code>runCalculation</code> using <code>invokedynamic</code> issues a call to the <code>bootstrapDynamic</code> method. This method does the dynamic linkage between the calling class (<code>SimpleDynamicInvoker</code>) and the receiving class (<code>SimpleDynamicLinkageExample</code>). The bootstrap method returns a <code>MethodHandle</code> that targets the receiving class. This method handle is cached for repetitive invocations of the <code>runCalculation</code> method.<br />
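Since the gists above are embedded as scripts, here is a compact sketch of what such a linkage class can look like, written against the <code>java.lang.invoke</code> API. The class and method names follow the post; the bodies are my reconstruction, not the original gist sources.<br />

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class SimpleDynamicLinkageExample {

    // The actual receiver of the dynamic call
    public static void sayHello() {
        System.out.println("There we go!");
    }

    // Called by the JVM the first time the invokedynamic instruction
    // executes; links the dynamic call site to the sayHello() method.
    // The returned ConstantCallSite caches the method handle, so
    // subsequent invocations skip the bootstrap step entirely.
    public static CallSite bootstrapDynamic(MethodHandles.Lookup lookup,
            String name, MethodType type)
            throws NoSuchMethodException, IllegalAccessException {
        return new ConstantCallSite(lookup.findStatic(
                SimpleDynamicLinkageExample.class, "sayHello",
                MethodType.methodType(void.class)));
    }
}
```

Note that the logical name passed in (here <code>runCalculation</code>) is ignored in this sketch; a real dynamic language runtime would use it to select the call target.<br />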
<br />
That's all in terms of <code>invokedynamic</code>. I have some more sophisticated examples published <a href="https://github.com/nschlimm/playground/tree/master/bytecode-playground/src/main/java/com/schlimm/bytecode/invokedynamic">here</a> in my Git repo. I hope you've enjoyed reading this - in times of shortage!<br />
<br />
Cheers, Niklas<br />
<br />
References:<br />
<br />
<a href="http://docs.oracle.com/javase/7/docs/technotes/guides/vm/multiple-language-support.html">http://docs.oracle.com/javase/7/docs/technotes/guides/vm/multiple-language-support.html</a><br />
<a href="http://asm.ow2.org/">http://asm.ow2.org/</a><br />
<a href="http://java.sun.com/developer/technicalArticles/DynTypeLang/">http://java.sun.com/developer/technicalArticles/DynTypeLang/</a><br />
<a href="http://asm.ow2.org/doc/tutorial-asm-2.0.html">http://asm.ow2.org/doc/tutorial-asm-2.0.html</a><br />
<a href="http://weblogs.java.net/blog/forax/archive/2011/01/07/calling-invokedynamic-java">http://weblogs.java.net/blog/forax/archive/2011/01/07/calling-invokedynamic-java</a><br />
<a href="http://nerds-central.blogspot.com/2011/05/performing-dynamicinvoke-from-java-step.html">http://nerds-central.blogspot.com/2011/05/performing-dynamicinvoke-from-java-step.html</a><br />
<br />
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js">
</script><br />
<br />
<script src="http://gist.github.com/raw/454771/gist-line-number-hack.js">
</script><br />
<br />
<script type="text/javascript">
addLineNumbersToAllGists()
</script>Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com11tag:blogger.com,1999:blog-5701415790759755571.post-70764829200994722492012-01-17T17:11:00.218+01:002012-03-02T07:31:12.922+01:00Java 7: How to write really fast Java codeWhen I first wrote this blog my intention was to introduce you to the class <code>ThreadLocalRandom</code>, which is new in Java 7, for generating random numbers. I have analyzed the performance of <code>ThreadLocalRandom</code> in a series of micro-benchmarks to find out how it performs in a single-threaded environment. The results were relatively surprising: although the code is very similar, <code>ThreadLocalRandom</code> is twice as fast as <code>Math.random()</code>! The results drew my interest and I decided to investigate this a little further. I have documented my analysis process. It is an exemplary introduction to the analysis steps, technologies and some of the JVM diagnostic tools required to understand differences in the performance of small code segments. Some experience with the described toolset and technologies will enable you to write faster Java code for your specific Hotspot target environment.<a name='more'></a><br />
<br />
OK, that's enough talk, let's get started! My machine is an ordinary Intel x86, Family 6, 3 GHz, 32-bit, dual core running Windows XP Professional.<br />
<br />
<code>Math.random()</code> works on a static singleton instance of <code>Random</code> whilst <code>ThreadLocalRandom -> current() -> nextDouble()</code> works on a thread-local instance of <code>ThreadLocalRandom</code>, which is a subclass of <code>Random</code>. <code>ThreadLocal</code> introduces the overhead of a variable look-up on each call to the <code>current()</code>-method. Considering what I've just said, it's really a little surprising that it's twice as fast as <code>Math.random()</code> in a single thread, isn't it? I didn't expect such a significant difference. <br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGL3U0VtrJDKMZ-hhjzeTAm3Rub6XnAVc_wZUT4Doavv1yNy2irbDyPIHr9rKFHAksAIH1WXUkR2ev4mDKlqEvlPrSxJX8__4bAf0n5Zt7_KYz5_M86eyZPR1q2KilE2tE-49UTW5Dnx0/s1600/ThreadLocalRandom.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGL3U0VtrJDKMZ-hhjzeTAm3Rub6XnAVc_wZUT4Doavv1yNy2irbDyPIHr9rKFHAksAIH1WXUkR2ev4mDKlqEvlPrSxJX8__4bAf0n5Zt7_KYz5_M86eyZPR1q2KilE2tE-49UTW5Dnx0/s320/ThreadLocalRandom.JPG" width="320" /></a></div><br />
Again, I am using a tiny micro-benchmarking framework presented <a href="http://www.javaspecialists.eu/archive/Issue124.html">in one of Heinz' blogs</a>. The framework that Heinz developed takes care of several challenges in benchmarking Java programs on modern JVMs. These challenges include: warm-up, garbage collection, accuracy of Java's time API, verification of test accuracy and so forth. <br />
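To make the measurement idea concrete, here is a heavily stripped-down harness of my own (a sketch only, not Heinz' framework): warm up first so the JIT can compile the hot path, then count how many runs complete in a fixed interval.<br />

```java
public class MiniBench {

    // Counts how often task.run() completes within intervalMs,
    // after a warm-up round of the same length. A real harness
    // additionally handles GC pauses, timer accuracy and result
    // verification, which this sketch deliberately omits.
    public static long countRuns(Runnable task, long intervalMs) {
        long warmupEnd = System.currentTimeMillis() + intervalMs;
        while (System.currentTimeMillis() < warmupEnd) {
            task.run(); // warm-up, result discarded
        }
        long count = 0;
        long end = System.currentTimeMillis() + intervalMs;
        while (System.currentTimeMillis() < end) {
            task.run();
            count++;
        }
        return count;
    }
}
```

Higher counts mean faster code, which is why the result listings below report a "mean execution count".<br />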
<br />
Here are my runnable benchmark classes:<br />
<br />
<pre class="java" name="code">public class ThreadLocalRandomGenerator implements BenchmarkRunnable {

    private double r;

    @Override
    public void run() {
        r = r + ThreadLocalRandom.current().nextDouble();
    }

    public double getR() {
        return r;
    }

    @Override
    public Object getResult() {
        return r;
    }
}

public class MathRandomGenerator implements BenchmarkRunnable {

    private double r;

    @Override
    public void run() {
        r = r + Math.random();
    }

    public double getR() {
        return r;
    }

    @Override
    public Object getResult() {
        return r;
    }
}
</pre><br />
Let's run the benchmark using Heinz' framework:<br />
<br />
<script src="https://gist.github.com/1583786.js">
</script><br />
Notice: To make sure the JVM does not identify the code as "dead code" I return a field variable and print out the result of my benchmarking immediately. That's why my runnable classes implement an interface called <a href="https://github.com/nschlimm/playground/blob/master/java7-playground/src/main/java/com/schlimm/java7/concurrency/random/BenchmarkRunnable.java">BenchmarkRunnable</a>. I am running this benchmark three times. The first run is in default mode, with inlining and JIT optimization enabled:<br />
<br />
<pre class="java" name="code">Benchmark target: MathRandomGenerator
Mean execution count: 14773594,4
Standard deviation: 180484,9
To avoid dead code coptimization: 6.4005410634212025E7
Benchmark target: ThreadLocalRandomGenerator
Mean execution count: 29861911,6
Standard deviation: 723934,46
To avoid dead code coptimization: 1.0155096190946539E8
</pre><br />
Then again without JIT optimization (VM option <code>-Xint</code>):<br />
<br />
<pre class="java" name="code">Benchmark target: MathRandomGenerator
Mean execution count: 963226,2
Standard deviation: 5009,28
To avoid dead code coptimization: 3296912.509302683
Benchmark target: ThreadLocalRandomGenerator
Mean execution count: 1093147,4
Standard deviation: 491,15
To avoid dead code coptimization: 3811259.7334526842
</pre><br />
The last test is with JIT optimization, but with <code>-XX:MaxInlineSize=0</code> which (almost) disables inlining:<br />
<br />
<pre class="java" name="code">Benchmark target: MathRandomGenerator
Mean execution count: 13789245
Standard deviation: 200390,59
To avoid dead code coptimization: 4.802723374491231E7
Benchmark target: ThreadLocalRandomGenerator
Mean execution count: 24009159,8
Standard deviation: 149222,7
To avoid dead code coptimization: 8.378231170741305E7
</pre><br />
Let's interpret the results carefully: with full JVM JIT optimization, <code>ThreadLocalRandom</code> is twice as fast as <code>Math.random()</code>. Turning JIT optimization off shows that the two perform equally well (or rather, equally badly). Method inlining seems to account for 30% of the performance difference. The remaining difference may be due to <a href="http://www.oracle.com/technetwork/java/whitepaper-135217.html">other optimization techniques</a>.<br />
<br />
One reason why the JIT compiler can tune <code>ThreadLocalRandom</code> more effectively is the improved implementation of <code>ThreadLocalRandom.next()</code>. <br />
<br />
<pre class="java" name="code">public class Random implements java.io.Serializable {
    ...
    protected int next(int bits) {
        long oldseed, nextseed;
        AtomicLong seed = this.seed;
        do {
            oldseed = seed.get();
            nextseed = (oldseed * multiplier + addend) & mask;
        } while (!seed.compareAndSet(oldseed, nextseed));
        return (int) (nextseed >>> (48 - bits));
    }
    ...
}

public class ThreadLocalRandom extends Random {
    ...
    protected int next(int bits) {
        rnd = (rnd * multiplier + addend) & mask;
        return (int) (rnd >>> (48 - bits));
    }
    ...
}
</pre><br />
The first snippet shows <code>Random.next()</code> which is used intensively in the benchmark of <code>Math.random()</code>. Compared to <code>ThreadLocalRandom.next()</code> the method requires significantly more instructions, although both methods do the same thing. In the <code>Random</code> class the <code>seed</code> variable stores a global shared state to all threads, it changes with every call to the <code>next()</code>-method. Therefore <code>AtomicLong</code> is required to safely access and change the <code>seed</code> value in calls to <code>nextDouble()</code>. <code>ThreadLocalRandom</code> on the other hand is - well - thread local :-) The <code>next()</code>-method does not have to be thread safe and can use an ordinary <code>long</code> variable as seed value. <br />
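To see the difference in isolation, here is a self-contained sketch of both seed-update strategies. The LCG constants are the well-known ones from <code>java.util.Random</code>; started from the same seed, both variants produce the same number sequence, but the shared variant pays for a CAS retry loop on every call.<br />

```java
import java.util.concurrent.atomic.AtomicLong;

public class SeedDemo {

    // Linear congruential generator constants from java.util.Random
    private static final long MULTIPLIER = 0x5DEECE66DL;
    private static final long ADDEND = 0xBL;
    private static final long MASK = (1L << 48) - 1;

    private final AtomicLong sharedSeed = new AtomicLong(42);
    private long localSeed = 42;

    // Shared state: CAS retry loop, as in Random.next()
    public int nextShared(int bits) {
        long oldseed, nextseed;
        do {
            oldseed = sharedSeed.get();
            nextseed = (oldseed * MULTIPLIER + ADDEND) & MASK;
        } while (!sharedSeed.compareAndSet(oldseed, nextseed));
        return (int) (nextseed >>> (48 - bits));
    }

    // Thread-confined state: plain field update, as in ThreadLocalRandom.next()
    public int nextLocal(int bits) {
        localSeed = (localSeed * MULTIPLIER + ADDEND) & MASK;
        return (int) (localSeed >>> (48 - bits));
    }
}
```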
<br />
<b>About method inlining and <code>ThreadLocalRandom</code><br />
</b><br />
One very effective JIT optimization is method inlining. In hot paths executed frequently the hotspot compiler decides to inline the code of called methods (child method) into the callers method (parent method). "Inlining has important benefits. It dramatically reduces the dynamic frequency of method invocations, which saves the time needed to perform those method invocations. But even more importantly, inlining produces much larger blocks of code for the optimizer to work on. This creates a situation that significantly increases the effectiveness of traditional compiler optimizations, overcoming a major obstacle to increased Java programming language performance."<br />
<br />
Since <a href="http://download.java.net/jdk7/archive/b142/binaries/">OpenJDK 7 debug builds</a> you can monitor method inlining by using diagnostic JVM options. Running the code with '<code>-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining</code>' will show the inlining efforts of the JIT compiler. Here are the relevant sections of the output for <code>Math.random()</code> benchmark:<br />
<br />
<pre class="java" name="code">@ 13 java.util.Random::nextDouble (24 bytes)
@ 3 java.util.Random::next (47 bytes) callee is too large
@ 13 java.util.Random::next (47 bytes) callee is too large
</pre><br />
The JIT compiler cannot inline the <code>Random.next()</code> method that is called in <code>Random.nextDouble()</code>. This is the inlining output for <code>ThreadLocalRandom.next()</code>:<br />
<br />
<pre class="java" name="code">@ 8 java.util.Random::nextDouble (24 bytes)
@ 3 java.util.concurrent.ThreadLocalRandom::next (31 bytes)
@ 13 java.util.concurrent.ThreadLocalRandom::next (31 bytes)
</pre><br />
Due to the fact that the <code>next()</code>-method is shorter (31 bytes) it can be inlined. Because the <code>next()</code>-method is called intensively in both benchmarks this log suggests that method inlining may be one reason why <code>ThreadLocalRandom</code> performs significantly faster. <br />
<br />
To verify that, and to find out more, we need to dive into the assembly code. With Java 7 JDKs it is possible to print the assembly code to the console. See <a href="https://wikis.oracle.com/display/HotSpotInternals/PrintAssembly">here</a> for how to enable the <code>-XX:+PrintAssembly</code> VM option. The option prints the JIT-optimized code, which means you can see the code the JVM actually executes. I have copied the relevant assembly code into the links below.<br />
<br />
Assembly code of ThreadLocalRandomGenerator.run() <a href="https://gist.github.com/1583170">here</a>.<br />
Assembly code of MathRandomGenerator.run() <a href="https://gist.github.com/1583188">here</a>.<br />
Assembly code of Random.next() called by Math.random() <a href="https://gist.github.com/1583197">here</a>.<br />
<br />
<a href="http://en.wikipedia.org/wiki/X86_instruction_listings">Assembly code</a> is machine-specific, low-level code; it's more complicated to read than <a href="http://en.wikipedia.org/wiki/Java_bytecode_instruction_listings">bytecode</a>. Let's try to verify that method inlining has a relevant effect on performance in my benchmarks, and whether there are other obvious differences in how the JIT compiler treats <code>ThreadLocalRandom</code> and <code>Math.random()</code>. In <code>ThreadLocalRandomGenerator.run()</code> there is no procedure call to any of the subroutines like <code>Random.nextDouble()</code> or <code>ThreadLocalRandom.next()</code>. There is only one virtual (hence expensive) method call to <code>ThreadLocal.get()</code> visible (see line 35 in the <code>ThreadLocalRandomGenerator.run()</code> assembly). All the other code is inlined into <code>ThreadLocalRandomGenerator.run()</code>. In the case of <code>MathRandomGenerator.run()</code> there are <i>two</i> virtual method calls to <code>Random.next()</code> (see block B4, line 204 ff. in the assembly code of <code>MathRandomGenerator.run()</code>). This fact confirms our suspicion that method inlining is one important root cause of the performance difference. Furthermore, due to the synchronization overhead, considerably more (and some expensive!) assembly instructions are required in <code>Random.next()</code>, which is also counterproductive in terms of execution speed.<br />
<br />
<b>Understanding the overhead of the <code>invokevirtual</code> instruction</b><br />
<br />
So why is (virtual) method invocation expensive and method inlining so effective? The pointer of <code>invokevirtual</code> instructions is not an offset of a concrete method in a class instance. The compiler does not know the internal layout of a class instance. Instead, it generates symbolic references to the methods of an instance, which are stored in the runtime constant pool. Those runtime constant pool items are resolved <i>at run time</i> to determine the actual method location. This dynamic (run-time) binding requires verification, preparation and resolution, which can considerably affect performance. (see <a href="http://java.sun.com/docs/books/jvms/second_edition/html/Compiling.doc.html#14787">Invoking Methods</a> and <a href="http://java.sun.com/docs/books/jvms/second_edition/html/Concepts.doc.html#22574">Linking</a> in the JVM Spec for details)<br />
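The resolve-then-invoke step can be made visible with the method handle API: the symbolic reference (<code>String.length()</code> with descriptor <code>()I</code>) is resolved explicitly before it can be invoked. This is only an illustration of the linking concept; the JVM's internal linking is of course not implemented in Java.<br />

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class VirtualCallDemo {

    // Resolves the symbolic reference String.length()I to a concrete
    // method handle, then invokes it - mirroring the two steps the JVM
    // performs when it links an invokevirtual call site
    public static int callLength(String s) {
        try {
            MethodHandle length = MethodHandles.lookup().findVirtual(
                    String.class, "length", MethodType.methodType(int.class));
            return (int) length.invokeExact(s);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```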
<br />
That's all for now. The disclaimer: of course, the list of topics you need to understand to solve performance riddles is endless. There is a lot more to understand than micro-benchmarking, JIT optimization, method inlining, Java bytecode, assembly language and so forth. Also, there are a lot more root causes for performance differences than just virtual method calls or expensive thread synchronization instructions. However, I think the topics I have introduced are a good start into such deep-diving stuff. Looking forward to critical and enjoyable comments!<br />
<br />
Cheers,<br />
NiklasNiklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com21tag:blogger.com,1999:blog-5701415790759755571.post-46118226416873680992011-12-28T19:13:00.009+01:002011-12-29T18:28:35.261+01:00Java 7: Understanding the PhaserJava 7 introduces a flexible thread synchronization mechanism called <code>Phaser</code>. If you need to wait for threads to arrive before you can continue or start another set of tasks, then <code>Phaser</code> is a good choice. Here is the listing, everything is explained step-by-step.<br />
<a name='more'></a><pre class="java" name="code">import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.concurrent.Phaser;

public class PhaserExample {

    public static void main(String[] args) throws InterruptedException {
        List<Runnable> tasks = new ArrayList<>();
        for (int i = 0; i < 2; i++) {
            Runnable runnable = new Runnable() {
                @Override
                public void run() {
                    int a = 0, b = 1;
                    for (int i = 0; i < 2000000000; i++) {
                        a = a + b;
                        b = a - b;
                    }
                }
            };
            tasks.add(runnable);
        }
        new PhaserExample().runTasks(tasks);
    }

    void runTasks(List<Runnable> tasks) throws InterruptedException {
        final Phaser phaser = new Phaser(1) {
            protected boolean onAdvance(int phase, int registeredParties) {
                return phase >= 1 || registeredParties == 0;
            }
        };
        for (final Runnable task : tasks) {
            phaser.register();
            new Thread() {
                public void run() {
                    do {
                        phaser.arriveAndAwaitAdvance();
                        task.run();
                    } while (!phaser.isTerminated());
                }
            }.start();
            Thread.sleep(500);
        }
        phaser.arriveAndDeregister();
    }
}
</pre><br />
This example lets us learn a lot about the internals of a <code>Phaser</code>. Let's go through the code:<br />
<br />
<b>Lines 8-26</b>: the <code>main</code>-Method that creates two <code>Runnable</code> tasks<br />
<b>Line 28</b>: Task list is passed to the <code>runTasks</code>-Method<br />
<br />
The <code>runTasks</code>-Method actually uses a <code>Phaser</code> to synchronize the tasks in such a way that each task in the list needs to arrive at the barrier before they are executed in parallel. The task list is executed twice. The first cycle starts when both threads have arrived at the barrier (see image mark 1). The second cycle starts when both threads have arrived at the barrier again (see image mark 2).<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4cHZ2jHnPRC8Rbd-pnsGdlWMsqD2SR3gBSiGEvuO4cKtUYf1yZwxs1DaR0JHJKJRjxCT5ZKiMbiAEUTogog0wRa1NxK7CZRg6wVrZfPJClRhFGnHtGIS-UP-VIP075X5cNQsLBGGrR9g/s1600/Unbenannt.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4cHZ2jHnPRC8Rbd-pnsGdlWMsqD2SR3gBSiGEvuO4cKtUYf1yZwxs1DaR0JHJKJRjxCT5ZKiMbiAEUTogog0wRa1NxK7CZRg6wVrZfPJClRhFGnHtGIS-UP-VIP075X5cNQsLBGGrR9g/s320/Unbenannt.JPG" width="320" /></a></div><br />
<blockquote style="background-color: #cfe2f3;">Notice: "party" is a term in the <code>Phaser</code> context that is equivalent to what we mean by a thread. When one party arrives, then one thread arrived at the synchronization barrier.</blockquote><br />
<b>Line 34</b>: create a <code>Phaser</code> that has one registered party (this means: at this time phaser expects one thread=party to arrive before it can start the execution cycle)<br />
<b>Line 35</b>: implement the <code>onAdvance</code>-Method to ensure that this task list is executed exactly twice (line 36 returns <code>true</code>, and thereby terminates the phaser, once the phase is equal to or higher than 1)<br />
<b>Line 40</b>: iterate over the list of tasks<br />
<b>Line 41</b>: register this thread with the <code>Phaser</code>. Notice that a <code>Phaser</code> instance does not know the task instances. <span style="background-color: #ffe599;">It's a simple counter of registered, unarrived and arrived parties, shared across participating threads.</span> If two parties are registered then two parties must arrive at the phaser to be able to start the first cycle.<br />
<b>Line 45</b>: tell the thread to wait at the barrier until the arrived parties equal the registered parties<br />
<b>Line 50</b>: Just for demonstration purposes, this line delays execution. The <a href="https://github.com/nschlimm/playground/blob/master/java7-playground/src/main/java/com/schlimm/java7/concurrency/phaser/PhaserExample.java">original code snippet</a> prints internal infos about the Phaser state to standard out.<br />
<b>Line 51</b>: two tasks are registered, in total three parties are registered.<br />
<b>Line 53</b>: deregister one party. This results in two registered parties and two arrived parties. This causes the threads waiting (Line 45) to execute the first cycle. (in fact the third party arrived while three were registered - but it does not make a difference)<br />
<br />
<a href="https://github.com/nschlimm/playground/blob/master/java7-playground/src/main/java/com/schlimm/java7/concurrency/phaser/PhaserExample.java">The original code snippet</a> stored in my Git repository creates the following output:<br />
<br />
<pre class="java" name="code">After phaser init -> Registered: 1 - Unarrived: 1 - Arrived: 0 - Phase: 0
After register -> Registered: 2 - Unarrived: 2 - Arrived: 0 - Phase: 0
After arrival -> Registered: 2 - Unarrived: 1 - Arrived: 1 - Phase: 0
After register -> Registered: 3 - Unarrived: 2 - Arrived: 1 - Phase: 0
After arrival -> Registered: 3 - Unarrived: 1 - Arrived: 2 - Phase: 0
Before main thread arrives and deregisters -> Registered: 3 - Unarrived: 1 - Arrived: 2 - Phase: 0
On advance -> Registered: 2 - Unarrived: 0 - Arrived: 2 - Phase: 0
After main thread arrived and deregistered -> Registered: 2 - Unarrived: 2 - Arrived: 0 - Phase: 1
Main thread will terminate ...
Thread-0:go :Wed Dec 28 16:09:16 CET 2011
Thread-1:go :Wed Dec 28 16:09:16 CET 2011
Thread-0:done:Wed Dec 28 16:09:20 CET 2011
Thread-1:done:Wed Dec 28 16:09:20 CET 2011
On advance -> Registered: 2 - Unarrived: 0 - Arrived: 2 - Phase: 1
Thread-0:go :Wed Dec 28 16:09:20 CET 2011
Thread-1:go :Wed Dec 28 16:09:20 CET 2011
Thread-1:done:Wed Dec 28 16:09:23 CET 2011
Thread-0:done:Wed Dec 28 16:09:23 CET 2011
</pre><br />
<b>Line 1</b>: when the <code>Phaser</code> is initialized in line 34 of the code snippet then one party is registered and none arrived<br />
<b>Line 2</b>: after the first thread is registered in line 41 of the code example there are two registered parties and two unarrived parties. Since no thread has reached the barrier yet, no party has arrived.<br />
<b>Line 3</b>: the first thread arrives and waits at the barrier (line 45 in the code snippet)<br />
<b>Line 4</b>: register the second thread, three registered, two unarrived, one arrived<br />
<b>Line 5</b>: the second thread arrived at the barrier, hence two arrived now<br />
<b>Line 7</b>: one party is deregistered in the code line 53 of the code example, therefore <code>onAdvance</code>-Method is called and returns <code>false</code>. This starts the first cycle since registered parties equals arrived parties (i.e. two). Phase 1 is started -> cycle one (see image mark 1)<br />
<b>Line 8</b>: since all threads are notified and start their work, two parties are unarrived again, non arrived<br />
<b>Line 14</b>: after the threads have executed their tasks once, they arrive again (code line 45), the <code>onAdvance</code>-Method is called, and the 2nd cycle is executed<br />
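The register/arrive bookkeeping shown in the output can also be observed without any worker threads. Here is a small single-threaded sketch of my own (not part of the original snippet) that exercises the counters directly:<br />

```java
import java.util.concurrent.Phaser;

public class PhaserPartiesDemo {

    // Returns {registered, arrived, phase} after a short sequence of
    // register/arrive calls, all performed on the calling thread
    public static int[] demo() {
        Phaser phaser = new Phaser(1);                  // one registered party
        phaser.register();                              // now two registered
        int registered = phaser.getRegisteredParties();
        phaser.arrive();                                // first arrival, does not block
        int arrived = phaser.getArrivedParties();
        phaser.arriveAndDeregister();                   // last arrival -> phase advances
        int phase = phaser.getPhase();                  // phase 0 is finished, now 1
        return new int[] { registered, arrived, phase };
    }
}
```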
<br />
OK, go through it and look into my comments in <a href="https://github.com/nschlimm/playground/blob/master/java7-playground/src/main/java/com/schlimm/java7/concurrency/phaser/PhaserExample.java">the original code snippet</a> to learn more.Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com9tag:blogger.com,1999:blog-5701415790759755571.post-60089536179949214242011-12-15T18:02:00.003+01:002011-12-16T09:53:08.725+01:00Java 7: Fork and join decomposable input patternIn my <a href="http://niklasschlimm.blogspot.com/2011/12/java-7-fork-and-join-and-jar-jam.html">recent blog</a> I have introduced the fork and join framework of Java 7. This blog presents a little framework on top of raw fork and join. The framework implements the decomposable input pattern (dip) - which originated from my own laziness when I was using the framework a couple of times. I realized that I was writing the same code every time I implemented a slightly different use case. And you know, let's write a little piece of software that I can reuse. The decomposable input pattern framework was born. <br />
<br />
<a name='more'></a>You can <a href="https://github.com/nschlimm/playground/blob/master/forkjoindip-project-build/forkjoindip-project-1.0-SNAPSHOT.jar?raw=true">download the binary here</a>.<br />
The <a href="http://nschlimm.github.com/playground/forkjoindip-project-build/javadoc/apidocs/index.html">API-documentation is hosted here</a>.<br />
And the <a href="https://github.com/nschlimm/playground/blob/master/forkjoindip-project-build/forkjoindip-project-1.0-SNAPSHOT-sources.jar?raw=true">sources are also available here</a>.<br />
<br />
Now what's different when you use that framework? I'd say the difference is that the dip-framework follows good <a href="http://www.objectmentor.com/resources/articles/Principles_and_Patterns.pdf">OO design principles</a>, like the open-closed principle that says: "A module should be open for extension but closed for modification." In other words, I have separated concerns in a fork and join scenario to make the whole more flexible and easier to change.<br />
<br />
In my last blog I presented a code snippet that illustrated how to use plain fork and join to calculate offers of car insurances. Let's see how this can be done using my dip-framework.<br />
<br />
The input to the proposal calculation is - well - a list of proposals :-) In the dip framework you wrap the input of a <code>ForkJoinTask</code> into a subclass of <code>DecomposableInput</code>. The name originates from the fact that input to <code>ForkJoinTask</code> is decomposable. Here is the snippet:<br />
<br />
<script src="https://gist.github.com/1481682.js">
</script><br />
<br />
The class wraps the raw input to the <code>ForkJoinTask</code> and provides a method that describes how that input can be decomposed. It also provides a method <code>computeDirectly()</code> that decides whether this input is small enough for direct computation or needs to be decomposed further.<br />
<br />
The output of proposal calculation is a list of maps of prices. If you have four input proposals, you'll get a list of four maps with various prices. In the dip framework, you wrap the output into a subclass of <code>ComposableResult</code>.<br />
<br />
<script src="https://gist.github.com/1481730.js">
</script><br />
<br />
The class implements the <code>compose</code> method that can compose an atomic result of a computation into the existing raw result. It returns a <code>ComposableResult</code> instance that holds the new composition.<br />
<br />
I agree it's a little abstract. Not only is concurrency inherently complex, I am also putting another abstraction on top of it. But once you've used the framework you'll realize its strength. So stay tuned, we're almost finished :-)<br />
<br />
Now, you have an input and an output, and the last thing you need is a computation object. In my example that's the pricing engine. To connect the pricing engine to the dip framework, you'll need to implement a subclass of <code>ComputationActivityBridge</code>. <br />
<br />
<script src="https://gist.github.com/1481754.js">
</script><br />
<br />
The <code>PricingEngineBridge</code> implements the <code>compute</code> method that calls the pricing engine. It translates the <code>DecomposableInput</code> into an input that the pricing engine accepts. And it creates an instance of <code>ComposableResult</code> that contains the output of the pricing engine.<br />
<br />
Last thing to do is to get the stuff started.<br />
<br />
<script src="https://gist.github.com/1481770.js">
</script><br />
<br />
The example creates an instance of <code>GenericRecursiveTask</code> and passes the <code>ListOfProposals</code> as well as the <code>PricingEngineBridge</code> as input. If you pass that to the <code>ForkJoinPool</code> then you receive an instance of <code>ListOfPrices</code> as output.<br />
<br />
What's the advantage when you use the dip-framework? For instance:<br />
<br />
- you could pass arbitrary processing input to <code>GenericRecursiveTask</code> by implementing a subclass of <code>DecomposableInput</code><br />
- you could implement your own custom <code>RecursiveTask</code> the same way I have implemented <code>GenericRecursiveTask</code> and pass the proposals and the <code>PricingEngineBridge</code> to that task<br />
- you could implement a custom <code>ForkAndJoinProcessor</code> and use that by subclassing <code>GenericRecursiveTask</code>: that way you can control the creation of subtasks and their distribution across threads<br />
- you could exchange the processing activity (here: <code>PricingEngineBridge</code>) by implementing a custom <code>ComputationActivityBridge</code> and try alternative pricing engines, or do something completely different than calculating prices ...<br />
<br />
I think I have made my point: the whole is closed for modification, but open for extension now.<br />
<br />
The complete example code is <a href="https://github.com/nschlimm/playground/tree/master/java7-playground/src/main/java/com/schlimm/java7/concurrency/forkjoin/dippricingengine">here in my git repository</a>.<br />
<br />
Let me know if you like it. Looking forward to critical and enjoyable comments.<br />
<br />
Cheers, NiklasNiklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com5tag:blogger.com,1999:blog-5701415790759755571.post-13809243845098765422011-12-09T16:01:00.010+01:002011-12-11T17:39:58.625+01:00Java 7: Fork and join and the jam jarAnother Java 7 blog, this time about the new concurrency utilities, plain fork and join in particular. Everything is explained with a straightforward code example. Compared to Project Coin, concurrency is an inherently complex topic, so the code example is a little more involved. Let's get started.<br />
<br />
<a name='more'></a><b>The Fork and Join Executor Service<br />
</b><br />
Fork and join employs an efficient task scheduling algorithm that ensures optimized resource usage (memory and CPU) on multi-core machines. That algorithm is known as "<a href="http://supertech.csail.mit.edu/papers/steal.pdf">work stealing</a>". Idle threads in a fork join pool attempt to find and execute subtasks created by other active threads. This is very efficient 'cause larger units get divided into smaller units of work that get distributed across all active threads (and CPUs). Here is an analogy to explain the strength of fork and join algorithms: if you have a jam jar and you fill it with ping-pong balls, there is a lot of air left in the glass. Think of the air as unused CPU resource. If you fill your jam jar with peas (or sand) there is less air in the glass. Fork and join is like filling the jam jar with peas. There is also more volume in your glass using peas, 'cause there is less air (less waste). Fork and join algorithms always ensure an optimal (smaller) number of active threads than work-sharing algorithms do. This is for the same "peas reason". Think of the jam jar as your thread pool and the peas as your tasks. With fork and join you can host more tasks (and more total volume) with the same number of threads (in the same jam jar).<br />
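The stealing itself can be observed with the standard API: <code>ForkJoinPool</code> exposes a <code>getStealCount()</code> counter. The following minimal, self-contained sketch (the summing workload and the threshold of 1,000 are arbitrary demo choices, not from the post) forks one half of each range so that idle threads can steal it:<br />

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class WorkStealingDemo {

    // sums the range [from, to) by recursive decomposition
    static class SumTask extends RecursiveTask<Long> {
        private final long from, to;
        SumTask(long from, long to) { this.from = from; this.to = to; }

        @Override
        protected Long compute() {
            if (to - from <= 1_000) {             // small enough: compute directly
                long sum = 0;
                for (long i = from; i < to; i++) sum += i;
                return sum;
            }
            long mid = (from + to) / 2;
            SumTask left = new SumTask(from, mid);
            left.fork();                          // let an idle thread steal this half
            long rightResult = new SumTask(mid, to).compute();
            return left.join() + rightResult;     // join the (possibly stolen) half
        }
    }

    public static long sum(long n) {
        ForkJoinPool pool = new ForkJoinPool();
        long result = pool.invoke(new SumTask(0, n));
        // how many forked subtasks were stolen by idle threads (varies per run)
        System.out.println("steal count: " + pool.getStealCount());
        pool.shutdown();
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sum(1_000_000)); // sum of 0..999999 = 499999500000
    }
}
```

The steal count differs between runs and machines; on a single-core box it may well be zero, which is exactly the point: stealing only happens when there are idle workers.<br />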
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhCg4AOBcrLbNihO_krbYeu3gh1YsiSO1f7us5uVdEA2up_9giJG339purqjtubOZh4wilmLaZlz6eqMkIrMYiErCTNx8KsInuFKNLA9jgMQHHgYqrZcAuDCUokoDyOPUP61d2ptEZ7q_Y/s1600/weird.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="172" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhCg4AOBcrLbNihO_krbYeu3gh1YsiSO1f7us5uVdEA2up_9giJG339purqjtubOZh4wilmLaZlz6eqMkIrMYiErCTNx8KsInuFKNLA9jgMQHHgYqrZcAuDCUokoDyOPUP61d2ptEZ7q_Y/s320/weird.JPG" width="267" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 1: Fork and join in the jam jar</td></tr>
</tbody></table><br />
Here is a plain fork and join code example:<br />
<br />
<script src="https://gist.github.com/1451620.js">
</script><br />
<br />
Fork and join tasks typically follow a similar control flow. In my example I want to calculate the prices for a list of car insurance offers. Let's go through the example.<br />
<br />
<b>Line 10</b>: Fork and join tasks extend <code>RecursiveTask</code> or <code>RecursiveAction</code>. Tasks return a result, actions don't. <code>RecursiveTask</code>s let you specify the return type using generics. The result of my example is a <code>List</code> of <code>Map</code>s which contain the prices for the car insurance covers. One map of prices for each proposal. <br />
<b>Line 12</b>: The task will calculate prices for proposals.<br />
<b>Line 22</b>: Fork and join tasks implement the <code>compute</code> method. Again, the <code>compute</code> method returns a list of maps that contain prices. If there are four proposals in the input list, then there will be four maps of prices.<br />
<b>Line 24-26</b>: Is the task stack (list of proposals) small enough to compute directly? If yes, then compute in this thread, which means call the pricing engine to calculate the proposal. If no, continue: split the work and call the task recursively.<br />
<b>Line 31</b>: Determine where to split the list.<br />
<b>Line 33</b>: Create a new task for the first part of the split list.<br />
<b>Line 34</b>: Fork that task: allow some other thread to perform that smaller subtask. That thread will call <code>compute</code> recursively on that subtask instance. <br />
<b>Line 35</b>: Create a new task for the second part of the split list.<br />
<b>Line 36</b>: Prepare the composed result list of the two divided subtasks (you need to compose the results of the two subtasks into a single result of the parent task)<br />
<b>Line 37</b>: Compute the second subtask in this current thread and add the result to the result list.<br />
<b>Line 38</b>: In the meantime the first subtask f1 was computed by some other thread. Join the result of the first subtask into the composed result list. <br />
<b>Line 39</b>: Return the composed result.<br />
<br />
You need to start the fork and join task. <br />
<b><br />
Line 49</b>: Create the main fork and join task with the initial list of proposals.<br />
<b>Line 53</b>: Create a fork and join thread pool.<br />
<b>Line 55</b>: Submit the main task to the fork and join pool.<br />
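Condensed into one self-contained sketch, the control flow walked through above looks roughly like this. The pricing engine is stubbed with a dummy calculation here, and the threshold of one proposal per direct computation is an illustrative choice; see the linked repository for the real code:<br />

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ProposalPricingTask extends RecursiveTask<List<Map<String, Double>>> {

    private static final int DIRECT_THRESHOLD = 1; // price one proposal per task

    private final List<String> proposals; // simple stand-in for Proposal objects

    public ProposalPricingTask(List<String> proposals) { this.proposals = proposals; }

    @Override
    protected List<Map<String, Double>> compute() {
        if (proposals.size() <= DIRECT_THRESHOLD) {
            return priceDirectly();                      // small enough: call the engine
        }
        int mid = proposals.size() / 2;                  // where to split
        ProposalPricingTask first = new ProposalPricingTask(proposals.subList(0, mid));
        first.fork();                                    // another thread may steal this
        ProposalPricingTask second = new ProposalPricingTask(proposals.subList(mid, proposals.size()));
        List<Map<String, Double>> result = new ArrayList<>(second.compute()); // this thread
        result.addAll(first.join());                     // join the forked half
        return result;
    }

    // dummy stand-in for the real PricingEngine
    private List<Map<String, Double>> priceDirectly() {
        List<Map<String, Double>> prices = new ArrayList<>();
        for (String proposal : proposals) {
            Map<String, Double> cover = new HashMap<>();
            cover.put("liability", 100.0 + proposal.length());
            prices.add(cover);
        }
        return prices;
    }

    public static List<Map<String, Double>> priceAll(List<String> proposals) {
        return new ForkJoinPool().invoke(new ProposalPricingTask(proposals));
    }

    public static void main(String[] args) {
        List<Map<String, Double>> prices = priceAll(Arrays.asList("A", "BB", "CCC", "DDDD"));
        System.out.println(prices.size() + " maps of prices"); // one map per proposal
    }
}
```

Note that the composed list does not necessarily preserve the input order, since each task appends its directly computed half before joining the forked one.<br />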
<br />
That's it. You can look into the complete code <a href="https://github.com/nschlimm/playground/tree/master/java7-playground/src/main/java/com/schlimm/java7/concurrency/forkjoin/pricingengine">here</a>. You'll need the <a href="https://github.com/nschlimm/playground/blob/master/java7-playground/src/main/java/com/schlimm/java7/concurrency/forkjoin/pricingengine/PricingEngine.java">PricingEngine.java</a> and the <a href="https://github.com/nschlimm/playground/blob/master/java7-playground/src/main/java/com/schlimm/java7/concurrency/forkjoin/pricingengine/Proposal.java">Proposal.java</a>.<br />
<br />
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js">
</script><br />
<br />
<script src="http://gist.github.com/raw/454771/gist-line-number-hack.js">
</script><br />
<br />
<script type="text/javascript">
addLineNumbersToAllGists()
</script>Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com3tag:blogger.com,1999:blog-5701415790759755571.post-34995785008940123922011-12-02T08:19:00.000+01:002011-12-02T08:19:09.821+01:00Java 7: Project Coin in code examplesThis blog introduces - by code examples - some new Java 7 features summarized under the term <a href="http://openjdk.java.net/projects/coin/">Project Coin</a>. The goal of Project Coin is to add a set of small language changes to JDK 7. These changes do simplify the Java language syntax. Less typing, cleaner code, happy developer ;-) Let's look into that.<br />
<br />
<a name='more'></a><br />
<div style="background-color: #cfe2f3; text-align: center;"><b>Prerequisites</b></div><br />
Install <a href="http://www.oracle.com/technetwork/java/javase/downloads/index.html">Java 7 SDK</a> on your machine<br />
Install <a href="http://eclipse.org/downloads/">Eclipse Indigo</a> 3.7.1<br />
<br />
You need to look out for the correct bundles for your operating system.<br />
<br />
In your Eclipse workspace you need to define the installed Java 7 JDK in your runtime. In the Workbench go to Window > Preferences > Java > Installed JREs and add your Java 7 home directory. <br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjq9lehwZXgO0sfx_WV627GUD_udfFM8KCvQprEeuk8fMdZDUzpgxgjzYDl0cmRAPsZGdp1Mbzwj0MPcPsXOgf27LTQQE31EO3FuMg7Aw2apchcoHxrnONdrmt1VDN_jc57M2kIm7yCC_8/s1600/Installed.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="287" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjq9lehwZXgO0sfx_WV627GUD_udfFM8KCvQprEeuk8fMdZDUzpgxgjzYDl0cmRAPsZGdp1Mbzwj0MPcPsXOgf27LTQQE31EO3FuMg7Aw2apchcoHxrnONdrmt1VDN_jc57M2kIm7yCC_8/s320/Installed.jpg" width="320" /></a></div><br />
Next you need to set the compiler level to 1.7 in Java > Compiler.<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiSTeL3rWHGe5TJwR8zj0nKKzCevcn4-lPI2UeFsACR87qrCGsFdS736Ex4OlRHXEGZfc7npboefkSlKAN_TufZLq67mUqjzjfv0P9McMg-LgpKALzJ4PhBfpYSZNGYxI-WJa8NprrST-E/s1600/Compiler.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="287" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiSTeL3rWHGe5TJwR8zj0nKKzCevcn4-lPI2UeFsACR87qrCGsFdS736Ex4OlRHXEGZfc7npboefkSlKAN_TufZLq67mUqjzjfv0P9McMg-LgpKALzJ4PhBfpYSZNGYxI-WJa8NprrST-E/s320/Compiler.jpg" width="320" /></a></div><br />
<div style="background-color: #cfe2f3; text-align: center;"><b>Project Coin</b></div><br />
<b>Improved literals<br />
</b><br />
A literal is the source code representation of a fixed value.<br />
<br />
"In Java SE 7 and later, any number of underscore characters (_) can appear anywhere between digits in a numerical literal. This feature enables you to separate groups of digits in numeric literals, which can improve the readability of your code." (from <a href="http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html">the Java Tutorials</a>)<br />
<br />
<pre class="java" name="code">public class LiteralsExample {
public static void main(String[] args) {
System.out.println("With underscores: ");
long creditCardNumber = 1234_5678_9012_3456L;
long bytes = 0b11010010_01101001_10010100_10010010;
System.out.println(creditCardNumber);
System.out.println(bytes);
System.out.println("Without underscores: ");
creditCardNumber = 1234567890123456L;
bytes = 0b11010010011010011001010010010010;
System.out.println(creditCardNumber);
System.out.println(bytes);
}
}</pre><br />
Notice the underscores in the literals (e.g. 1234_5678_9012_3456L). Results written to the console:<br />
<br />
<pre class="java" name="code">With underscores:
1234567890123456
-764832622
Without underscores:
1234567890123456
-764832622</pre><br />
As you can see, the underscores do not make a difference to the values. They are just used to make the code more readable.<br />
<br />
<b>SafeVarargs</b><br />
<br />
Pre-JDK 7, you always got an unchecked warning when calling certain varargs library methods. Without the new <code>@SafeVarargs</code> annotation this example would create unchecked warnings.<br />
<br />
<pre class="java" name="code">public class SafeVarargsExample {
@SafeVarargs
static void m(List<string>... stringLists) {
Object[] array = stringLists;
List<integer> tmpList = Arrays.asList(42);
array[0] = tmpList; // compiles without warnings
String s = stringLists[0].get(0); // ClassCastException at runtime
}
public static void main(String[] args) {
m(new ArrayList<string>());
}
}</string></integer></string></pre><br />
The new <code>@SafeVarargs</code> annotation does not help to get around the annoying <code>ClassCastException</code> at runtime. Also, it can only be applied to static and final methods. Therefore, I believe it will not be a great help. Future versions of Java will have compile time errors for unsafe code like the one in the example above.<br />
<br />
<b>Diamond</b><br />
<br />
In Java 6 it required some patience to create, say, a list of maps. Look at this example:<br />
<br />
<script src="https://gist.github.com/1409477.js">
</script><br />
As you can see in the right part of the assignments in lines 3 and 4, you need to repeat the type information for the <code>listOfMaps</code> variable as well as for the <code>aMap</code> variable. This isn't necessary anymore in Java 7:<br />
<br />
<script src="https://gist.github.com/1409490.js">
</script><br />
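In essence, the two gists differ only on the right-hand side of the assignments. A condensed, self-contained version of both styles:<br />

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DiamondExample {

    public static List<Map<String, Integer>> build() {
        // Java 6 style: the type arguments must be repeated on the right-hand side
        List<Map<String, Integer>> listOfMaps6 = new ArrayList<Map<String, Integer>>();

        // Java 7: the diamond operator <> lets the compiler infer them
        List<Map<String, Integer>> listOfMaps7 = new ArrayList<>();
        Map<String, Integer> aMap = new HashMap<>();
        aMap.put("answer", 42);
        listOfMaps7.add(aMap);
        return listOfMaps7;
    }

    public static void main(String[] args) {
        System.out.println(build()); // [{answer=42}]
    }
}
```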
<b>Multicatch</b><br />
<br />
In Java 7 you do not need a catch clause for every single exception; you can catch multiple exceptions in one clause. You remember code like this:<br />
<pre class="java" name="code">public class HandleExceptionsJava6Example {
public static void main(String[] args) {
Class string;
try {
string = Class.forName("java.lang.String");
string.getMethod("length").invoke("test");
} catch (ClassNotFoundException e) {
// do something
} catch (IllegalAccessException e) {
// do the same !!
} catch (IllegalArgumentException e) {
// do the same !!
} catch (InvocationTargetException e) {
// yeah, well, again: do the same!
} catch (NoSuchMethodException e) {
// ...
} catch (SecurityException e) {
// ...
}
}
}</pre><br />
Since Java 7 you can write it like this, which makes our lives a lot easier:<br />
<br />
<pre class="java" name="code">public class HandleExceptionsJava7ExampleMultiCatch {
public static void main(String[] args) {
try {
Class string = Class.forName("java.lang.String");
string.getMethod("length").invoke("test");
} catch (ClassNotFoundException | IllegalAccessException | IllegalArgumentException | InvocationTargetException | NoSuchMethodException | SecurityException e) {
// do something, and only write it once!!!
}
}
}</pre><br />
<b>String in switch statements</b><br />
<br />
Since Java 7 one can use string variables in switch clauses. Here is an example:<br />
<br />
<pre class="java" name="code">public class StringInSwitch {
public void printMonth(String month) {
switch (month) {
case "April":
case "June":
case "September":
case "November":
case "January":
case "March":
case "May":
case "July":
case "August":
case "December":
default:
System.out.println("done!");
}
}
}</pre><br />
<b>Try-with-resource</b><br />
<br />
This feature really helps in terms of reducing unexpected runtime exceptions. In Java 7 you can use the so-called try-with-resource clause that automatically closes all open resources when the try block is left, whether normally or because of an exception. Look at the example:<br />
<br />
<pre class="java" name="code">import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
public class TryWithResourceExample {
public static void main(String[] args) throws FileNotFoundException {
// Java 7 try-with-resource
String file1 = "TryWithResourceFile.out";
try (OutputStream out = new FileOutputStream(file1)) {
out.write("Some silly file content ...".getBytes());
":-p".charAt(3);
} catch (StringIndexOutOfBoundsException | IOException e) {
System.out.println("Exception on operating file " + file1 + ": " + e.getMessage());
}
// Java 6 style
String file2 = "WithoutTryWithResource.out";
OutputStream out = new FileOutputStream(file2);
try {
out.write("Some silly file content ...".getBytes());
":-p".charAt(3);
} catch (StringIndexOutOfBoundsException | IOException e) {
System.out.println("Exception on operating file " + file2 + ": " + e.getMessage());
}
// Let's try to operate on the resources
File f1 = new File(file1);
if (f1.delete())
System.out.println("Successfully deleted: " + file1);
else
System.out.println("Problems deleting: " + file1);
File f2 = new File(file2);
if (f2.delete())
System.out.println("Successfully deleted: " + file2);
else
System.out.println("Problems deleting: " + file2);
}
}</pre><br />
The try-with-resource clause is used to open the first file we want to operate on. The call <code>":-p".charAt(3)</code> then generates a runtime exception. Notice that I do not explicitly close the resource. This is done automatically when you use try-with-resource. It *isn't* when you use the Java 6 equivalent for the second file.<br />
<br />
The code will write the following result to the console:<br />
<br />
<pre class="java" name="code">Exception on operating file TryWithResourceFile.out: String index out of range: 3
Exception on operating file WithoutTryWithResource.out: String index out of range: 3
Successfully deleted: TryWithResourceFile.out
Problems deleting: WithoutTryWithResource.out</pre><br />
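A related detail worth knowing: try-with-resource works with any <code>AutoCloseable</code>, and if both the try block and the automatic <code>close()</code> call throw, the exception from <code>close()</code> does not replace the primary one; it is recorded via the new <code>Throwable.getSuppressed()</code> method. A small self-contained sketch (class names are illustrative):<br />

```java
public class SuppressedExample {

    // any AutoCloseable works in try-with-resources
    static class NoisyResource implements AutoCloseable {
        @Override
        public void close() {
            throw new IllegalStateException("failure in close()");
        }
    }

    public static Throwable[] demo() {
        try (NoisyResource r = new NoisyResource()) {
            throw new RuntimeException("primary failure");
        } catch (RuntimeException e) {
            // the close() exception did not swallow the primary one --
            // it was attached as a suppressed exception (new in Java 7)
            return e.getSuppressed();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()[0].getMessage()); // failure in close()
    }
}
```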
That's it in terms of Project Coin. Very useful stuff in my eyes.Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com5tag:blogger.com,1999:blog-5701415790759755571.post-57870570778820941162011-11-13T17:37:00.002+01:002011-11-13T17:39:44.468+01:00Java EE 6 and the snowball effectJava EE application servers increase their feature sets (APIs and administration features) whilst business applications get smaller and smaller. This introduces a new issue: if you need a single feature of a new application server version you'll get a complete package of features that you didn't need in the first place (the snowball effect). Let me give you an example: in WebSphere 7 IBM provides a high-speed integration adapter for IMS assets. We need that, but we don't need all the rest that gives us a headache in terms of migration efforts. Now, if the number of APIs increases in Java EE with every version, I predict that this problem will get more and more complicated. That's a reason why I don't appreciate the fact that Java EE standardizes former framework functionality like dependency injection (CDI). Business applications may get smaller, but application server feature sets get huge this way. Is that a good trend? <br />
<br />
<a name='more'></a>I typically try to keep application bundles small in my business applications. Not necessarily in size, but in terms of packaged features. This way I don't create artificial dependencies. You need to deploy a new web service version? Then I can deploy only that specific web service module without creating a snowball effect of additional required deployments. Small packages reduce the need for branch development. Small packages reduce the risk of instability. Small packages reduce the necessity to synchronize deployments of different development teams. If you finish your work or you need to deploy a new version of your application, then you should create as few dependencies as possible. In an ideal scenario you don't need to talk to any other developer if you want to deploy the features you're responsible for. I think this should also apply to application server software. The trend to shrink business application feature sets while increasing those of application servers should stop. Or: administration features of application servers should be separated and compatible with the different Java EE standards. A more modularized approach would be desirable.Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com23tag:blogger.com,1999:blog-5701415790759755571.post-3617905798424326712011-11-12T12:12:00.072+01:002011-11-13T13:08:05.922+01:00Characteristics of successful developersMany blogs exist about personal (soft) characteristics of successful developers. Here is a short listing of some interesting links:<br />
<br />
<a href="http://www.supercoders.com.au/blog/50characteristicsofagreatsoftwaredeveloper.shtml">50 characteristics of a great software developer</a><br />
<a href="http://www.readwriteweb.com/archives/top_10_software_engineer_traits.php">Top 10 Traits of a Rockstar Software Engineer</a><br />
<a href="http://javablog.franksalinas.net/2009/05/09/five-essential-skills-for-software-developers/">Five essential skills for software developers</a><br />
<a href="http://agilemanifesto.org/">Manifesto for Agile Software Development<br />
</a><a href="http://manifesto.softwarecraftsmanship.org/">Manifesto for Software Craftsmanship</a><br />
<br />
This blog now is my personal view on that very topic. It's of course subjective, shaped by my own history and environment, and I don't claim that the list is complete. Also, I do not have the discipline to live up to all those characteristics 100% myself. We're all human, so don't take them too seriously :-) Last but not least: success should not be the target of your work. The target is to work on your own virtues, and some of those virtues are the topic of this blog.<br />
<br />
<a name='more'></a><b>The will to be good at something</b><br />
<br />
It's not easy to work as a developer! I say that for a couple of reasons that make our lives a little harder compared to other professions. For instance, the technology cycle in the IT world is very short; current knowledge becomes outdated within a few years. Therefore, we need to learn continuously as new things become important. To stay on top of things we really need the strong will to be good at our job. That's probably the most important characteristic to me: being an excellent knowledge worker with great technical abilities, and having the will to be that over decades!<br />
<br />
<b>To ask one's way</b><br />
<br />
Because it's impossible to know everything needed to do the job, it's absolutely necessary that a developer finds their way through a new topic. How I typically do that: I use Google and I talk to other experts to find out what they think. "I did not know what to do!" is not an argument for me. 'Cause if I don't know enough about a new technology yet, I spend the energy that's necessary to learn what I need to know to do the job. We need to work through the learning curve and make the extra effort to get good at what we're doing!<br />
<b><br />
To make oneself useful</b><br />
<br />
If I have some time left because I completed my tasks earlier than expected, then I take a coffee and play tabletop soccer. I take a rest. Afterwards I think about what I could do to help the team achieve its targets, 'cause some of my team mates probably didn't finish! (at least if I didn't meet them at tabletop soccer) If everyone's finished, then I think about improvements to the process or team organisation. I make myself useful.<br />
<br />
<b>To care</b><br />
<br />
Some years ago I attended a software architecture course held by one of my idols, <a href="http://www.bredemeyer.com/">Dana Bredemeyer</a>. I had a discussion with him about what it really takes to make a team successful or to be a successful team leader. He said: "Well, you need some people that really care!" I think there is a lot of truth in that statement. If we do not care about quality, timelines, good team culture, respectful communication (!!), clean code, software craftsmanship, if all this doesn't matter to us, then I believe the probability is higher that we fail. <br />
<br />
<b>Being productive</b><br />
<br />
Philippe Kruchten put it right in his <a href="http://www.ibm.com/developerworks/rational/library/4032.html">TAO for the software architect</a>:<br />
<br />
"Those who know don't talk.<br />
Those who talk don't know.<br />
Those who do not have a clue are still debating about the process.<br />
Those who know just do it."<br />
<br />
I am trying to be productive every week - at the end of a week I look back and I ask myself what I have produced. This could be paperwork, community days or (best!!) programming code.<br />
<br />
<b>Working solution-oriented</b><br />
<br />
In many situations where people had trouble achieving their targets, I saw them debating all the problems and the difficulties of solving the issue. They blamed each other and discussed THE PAST. I try not to do that: I don't blame others, and I don't just look at the difficulties. I try to suggest solutions instead! And yes, there is always a solution to a problem. Most of the time there are at least three solutions. <br />
<br />
<b>Be good with people</b><br />
<br />
Because our job typically involves working in a (ideally cross-functional!) team, it's important that we're (more or less) good at dealing with other individuals. They have their own strengths and weaknesses, just like ourselves. It's important to treat all team mates with respect, regardless of their technical competence or contributions. Of course, sometimes people deserve a clear statement, but try to do these things one-on-one. Make sure nobody loses face. Attend the meetings at the coffee bar, be good at tabletop soccer and go out once in a while to have a beer with your team. You know what I'm talking about.<br />
<br />
That's it. I am looking forward to your thoughts and comments!Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com2tag:blogger.com,1999:blog-5701415790759755571.post-5979531714446575752011-10-25T15:38:00.005+02:002012-03-22T07:12:39.939+01:00Threading stories: volatile and synchronizedIn <a href="http://niklasschlimm.blogspot.com/2011/10/threading-stories-why-volatile-matters.html">my last blog</a> about the <code>volatile</code> modifier I introduced a little program that illustrates the behaviour of <code>volatile</code> in a Java 6 (26) Hotspot VM. Since that day I have had some interesting discussions that I wanted to share in this blog. They add some valuable insights on the <code>volatile</code> modifier.<br />
<br />
<a name='more'></a>Here is my little program, which I have adapted a little to make it easier to follow. My previous example was originally intended as a thread contention example, which will be the topic of one of my upcoming posts. <br />
<pre class="java" name="code">import java.util.Timer;
import java.util.TimerTask;
public class AnotherVolatileExampleA {
private volatile boolean expired = false;
private long counter = 0;
private Object mutex = new Object();
private class Worker implements Runnable {
@Override
public void run() {
synchronized (mutex) {
final Timer timer = new Timer();
timer.schedule(new TimerTask() {
public void run() {
expired = true;
System.out.println("Timer interrupted main thread ...");
timer.cancel();
}
}, 1000);
while (!expired) {
counter++; // do some work
}
System.out.println("Main thread was interrupted by timer ...");
};
}
}
public static void main(String[] args) throws InterruptedException {
AnotherVolatileExampleA volatileExample = new AnotherVolatileExampleA();
Thread thread1 = new Thread(volatileExample.new Worker(), "Worker-1");
thread1.start();
}
}
</pre><br />
Now, this program still behaves similarly to <a href="http://niklasschlimm.blogspot.com/2011/10/threading-stories-why-volatile-matters.html">the one of my last blog</a>. With <code>volatile</code> on the <code>expired</code> field the result written to the console is:<br />
<br />
<pre class="java" name="code">Timer interrupted main thread ...
Main thread was interrupted by timer ...</pre><br />
Without <code>volatile</code> on the <code>expired</code> field the result is:<br />
<br />
<pre class="java" name="code">Timer interrupted main thread ...
</pre><br />
One question in a discussion was why that happens although everything takes place in a <code>synchronized</code> block. The Java VM specification says the <code>synchronized</code> keyword guarantees that (informally speaking) a variable is written to the memory heap and is read from the memory heap (<a href="http://java.sun.com/docs/books/jvms/second_edition/html/Threads.doc.html#22253">read here</a>). Now, this is true, but it misses the point that the thread only needs to read the variable ONCE within a single <code>synchronized</code> block. In the example above the <code>expired</code> variable is read once at the very first iteration of the while loop. Afterwards the thread does not need to read the variable again. Consider this program:<br />
<br />
<pre class="java" name="code">import java.util.Timer;
import java.util.TimerTask;
public class AnotherVolatileExampleB {
private boolean expired = false;
private long counter = 0;
private Object mutex = new Object();
private class Worker implements Runnable {
@Override
public void run() {
final Timer timer = new Timer();
timer.schedule(new TimerTask() {
public void run() {
expired = true;
System.out.println("Timer interrupted main thread ...");
timer.cancel();
}
}, 1000);
boolean tmpExpired = false;
while (!tmpExpired) {
synchronized (mutex) {
tmpExpired = expired;
}
counter++; // do some work
}
System.out.println("Main thread was interrupted by timer ...");
}
}
public static void main(String[] args) throws InterruptedException {
AnotherVolatileExampleB volatileExample = new AnotherVolatileExampleB();
Thread thread1 = new Thread(volatileExample.new Worker(), "Worker-1");
thread1.start();
}
}
</pre><br />
In that case the <code>synchronized</code> block is within the while loop (lines 23-25) and the thread is now forced to re-read the <code>expired</code> variable from main memory in each iteration, 'cause entering a <code>synchronized</code> block guarantees at least one fresh read from memory (the same applies to Java 5 locks). The result of that program will be as expected from a <code>synchronized</code> block:<br />
<br />
<pre class="java" name="code">Timer interrupted main thread ...
Main thread was interrupted by timer ...
</pre><br />
Therefore, if you wish to read a variable from memory in a <code>synchronized</code> block (or within a Java 5 lock), remember that the thread is only guaranteed to read the variable from the memory heap once. The <code>volatile</code> modifier, on the other hand, always guarantees a "memory heap read" (<a href="http://java.sun.com/docs/books/jvms/second_edition/html/Threads.doc.html#22258">see here</a>).Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com4tag:blogger.com,1999:blog-5701415790759755571.post-82958328794164822072011-10-19T08:16:00.035+02:002012-03-22T07:13:23.622+01:00Threading stories: Why volatile mattersMany years ago when I learned Java (in 2000) I was not so concerned about multithreading. In particular I wasn't concerned about the <code>volatile</code> modifier. I don't know why, but I never had problems without <code>volatile</code>, so maybe I thought it could not be that relevant. I suddenly changed my mind when I first analyzed a weird behaviour of an application that only showed up when the application was deployed to a server JVM. Today's JVMs do a lot of magic to optimize runtime performance of server applications. In this blog I show you an example to get familiar with problems that arise in multithreaded applications when you don't recognize the importance of understanding how Java treats shared data in multithreaded programs.<br />
<br />
<a name='more'></a>This code snippet demonstrates why understanding <code>volatile</code> is important. Here is the code that you can use to play around with. Notice that in line 8 the <code>expired</code> variable is declared <code>volatile</code>:<br />
<pre class="java" name="code">import java.util.Timer;
import java.util.TimerTask;

public class VolatileExample {

    // the flag shared between the timer task and the worker threads;
    // remove 'volatile' to reproduce the visibility problem
    private volatile boolean expired;
    private long counter = 0;
    private final Object mutex = new Object();

    public Object[] execute(Object... arguments) {
        synchronized (mutex) {
            expired = false;
            final Timer timer = new Timer();
            timer.schedule(new TimerTask() {
                public void run() {
                    expired = true;
                    System.out.println("Timer interrupted main thread ...");
                    timer.cancel();
                }
            }, 1000);
            while (!expired) {
                counter++; // do some work
            }
            System.out.println("Main thread was interrupted by timer ...");
        }
        return new Object[] { counter, expired };
    }

    private class Worker implements Runnable {
        @Override
        public void run() {
            while (!Thread.currentThread().isInterrupted()) {
                execute();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        VolatileExample volatileExample = new VolatileExample();
        Thread thread1 = new Thread(volatileExample.new Worker(), "Worker-1");
        Thread thread2 = new Thread(volatileExample.new Worker(), "Worker-2");
        thread1.start();
        thread2.start();
        Thread.sleep(60000);
        thread1.interrupt();
        thread2.interrupt();
    }
}
</pre>Run this on the HotSpot VM with the <code>-server</code> option set. You'll get the following, expected output:<br />
<pre class="java" name="code">Timer interrupted main thread ...
Main thread was interrupted by timer ...
Timer interrupted main thread ...
Main thread was interrupted by timer ...
Timer interrupted main thread ...
Main thread was interrupted by timer ...
Timer interrupted main thread ...
Main thread was interrupted by timer ...
Timer interrupted main thread ...
Main thread was interrupted by timer ...
</pre>Now remove the <code>volatile</code> modifier in line 8 above and restart, again with the <code>-server</code> option set. What you should get is the following output:<br />
<pre class="java" name="code">Timer interrupted main thread ...
</pre>What happened? The timer thread sets the <code>expired</code> flag to <code>true</code>, but the main thread never sees the change. This is exactly what <code>volatile</code> is all about: it ensures that all threads see the current value of a shared variable. If you declare a variable <code>volatile</code>, every thread reads its value from main memory. In the example above, the timer thread set <code>expired</code> in its own working memory, and the worker thread was never forced to notice the update in main memory! Notice that I cancel the timer when I set <code>expired</code> to <code>true</code>; this causes the timer thread to die immediately after the <code>run()</code> method completes. Main memory may be updated at that point, but the worker thread continues to work on the 'cached' value in its thread-local memory.<br />
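If you need a shared flag with the same visibility guarantee, <code>java.util.concurrent.atomic.AtomicBoolean</code> is an alternative worth knowing. Here is a minimal sketch; the class and method names are mine, not part of the example above:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class AtomicFlagExample {

    // AtomicBoolean gives the same visibility guarantee as a volatile boolean,
    // plus atomic compound operations such as compareAndSet
    private final AtomicBoolean expired = new AtomicBoolean(false);
    private long counter;

    public long spinUntilExpired() {
        while (!expired.get()) {  // always observes the latest value
            counter++;            // do some work
        }
        return counter;
    }

    public void expire() {
        expired.set(true);        // immediately visible to all threads
    }
}
```

For a plain on/off flag, <code>volatile boolean</code> is sufficient; <code>AtomicBoolean</code> earns its keep when you also need check-then-set logic to be atomic.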
<br />
Next: restart the code again without the <code>volatile</code> modifier, but this time with the <code>-client</code> JVM option set (which is the default mode on Windows). The result is the following:<br />
<pre class="java" name="code">Timer interrupted main thread ...
Main thread was interrupted by timer ...
Timer interrupted main thread ...
Main thread was interrupted by timer ...
Timer interrupted main thread ...
Main thread was interrupted by timer ...
Timer interrupted main thread ...
Main thread was interrupted by timer ...
Timer interrupted main thread ...
Main thread was interrupted by timer ...
</pre><br />
In client mode the JVM obviously behaves differently and does not optimize as aggressively as in server mode. So even if you omit the <code>volatile</code> modifier, you may not see an error during development. The JVM options influence how aggressively the JVM optimizes your code. Without <code>volatile</code> it is not guaranteed that data changes made by the timer thread are visible to the main thread. In this case everything still happens to work in client mode, which shows that the correctness of your program can depend on the JVM options set.Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com7tag:blogger.com,1999:blog-5701415790759755571.post-89788535524306042352011-09-30T16:10:00.014+02:002011-10-05T16:43:37.855+02:00Benchmark series on simple caching solutions in JavaCaching is a very common solution when you don't want to repeat CPU-intensive tasks. Over the last days I benchmarked several options for caching with <code>ConcurrentHashMap</code>. In this blog I publish the first results. I used <a href="http://www.javaspecialists.eu/archive/Issue124.html">Heinz Kabutz' Performance Checker</a> to do this, and I added some features based on my reading of <a href="http://www.ibm.com/developerworks/java/library/j-benchmark1/index.html">this article series</a>.<br />
<br />
<a name='more'></a><br />
<br />
<div style="background-color: #cfe2f3; text-align: center;"><b>My Conclusions up front</b></div><br />
I am testing three different cache implementations: the "check null", "check map" and "putIfAbsent" caches. The code is linked below the results. I am also using three different cache sizes: 10 (small), 100000 (large) and 1000000 (very large) possible key values. In other words: the 10-unit cache can hold 10 different key values, the 100000-unit cache can hold 100000 different key values, and so on.<br />
<br />
None of the implementation options shows significant performance differences at equivalent cache sizes. This is disappointing (!!) but also good to know. It is also a little surprising, I think, because for example the "putIfAbsent" cache appears to be more complex than the others. All solutions seem to degrade logarithmically in performance as the cache grows to very large sizes. The main reason for that should be the increased time for cache initialization, because in my test harness I start with an empty cache and a fixed set of possible key values (i.e. 10, 100000 or 1000000). I'll come up with a benchmark series for a fully initialized cache later.<br />
<br />
Knowing what I know now, I would carefully recommend the "putIfAbsent" cache solution. It has equivalent performance but gives you great flexibility to design the behaviour of your cache in highly concurrent scenarios. See <a href="http://niklasschlimm.blogspot.com/2011/09/your-web-applications-work-by-sheer.html">the pattern solution of my last article</a> about multithreading as an example of more complex use cases.<br />
<br />
If you're interested in the test harness take a look at the implementations: <code><a href="https://github.com/nschlimm/playground/blob/master/webappbenchmarker-playground/src/main/java/com/schlimm/webappbenchmarker/command/std/CacheSolution_CheckMap.java">CacheSolution_CheckMap.java</a></code>, <code><a href="https://github.com/nschlimm/playground/blob/master/webappbenchmarker-playground/src/main/java/com/schlimm/webappbenchmarker/command/std/CacheSolution_CheckNull.java">CacheSolution_CheckNull.java</a></code> and <code><a href="https://github.com/nschlimm/playground/blob/master/webappbenchmarker-playground/src/main/java/com/schlimm/webappbenchmarker/command/std/CacheSolution_PutIfAbsent.java">CacheSolution_PutIfAbsent.java</a></code>. I appreciate your comments very much!<br />
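If you just want the gist of the three strategies without opening the links, here is a rough, condensed sketch of each. This is my own version, not the benchmarked harness code; <code>compute()</code> stands in for the CPU-intensive work:

```java
import java.util.concurrent.ConcurrentHashMap;

public class CacheSketches {

    private final ConcurrentHashMap<Integer, String> cache = new ConcurrentHashMap<>();

    // "check null": get first, compute and put on a miss
    // (under contention two threads may compute the same key)
    String checkNull(Integer key) {
        String value = cache.get(key);
        if (value == null) {
            value = compute(key);
            cache.put(key, value);
        }
        return value;
    }

    // "check map": containsKey first, then get (two lookups on a hit)
    String checkMap(Integer key) {
        if (!cache.containsKey(key)) {
            cache.put(key, compute(key));
        }
        return cache.get(key);
    }

    // "putIfAbsent": compute eagerly, let the map keep the first value atomically
    String putIfAbsent(Integer key) {
        String value = compute(key);
        String previous = cache.putIfAbsent(key, value);
        return previous != null ? previous : value;
    }

    private String compute(Integer key) {
        return "value-" + key; // stands in for the CPU-intensive task
    }
}
```

Note that in this naive form all three variants may compute a value more than once under contention; the "putIfAbsent" variant at least guarantees that every caller sees the same cached instance.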
<br />
My JVM ran in mixed JIT mode with the <code>-server</code> option set. OK, let's get into it!<br />
<br />
<div style="background-color: #cfe2f3; text-align: center;"><b>Here are the results<br />
</b></div><br />
<div style="background-color: #cfe2f3;"><script src="https://gist.github.com/1253756.js">
</script></div><br />
5 test runs each / 500 ms each test run<br />
CL before/after = Classes Loaded before and after test harness<br />
JIT before/after = Total JIT time before and after test harness<br />
Small cache = 10 units<br />
Large cache = 100000 units<br />
Very large cache = 1000000 units<br />
<br />
<div style="background-color: #cfe2f3; text-align: center;"><b>Check null cache</b></div><br />
<script src="https://gist.github.com/1253758.js">
</script><br />
<br />
<div style="background-color: #cfe2f3; text-align: center;"><b>Check map cache</b></div><br />
<script src="https://gist.github.com/1253760.js">
</script><br />
<br />
<div style="background-color: #cfe2f3; text-align: center;"><b>The putIfAbsent cache</b></div><br />
<script src="https://gist.github.com/1253766.js">
</script>Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com3tag:blogger.com,1999:blog-5701415790759755571.post-16640061585668344282011-09-09T22:01:00.005+02:002012-03-22T07:14:16.683+01:00Your web applications work - by sheer coincidence!This blog describes a solution to a typical concurrency problem in a web application environment, and it illustrates that you - in all likelihood - cannot be sure that your application is thread-safe. It just works - by sheer coincidence.<br />
<br />
<a name='more'></a>Last week we had a severe problem in a critical web module of our production environment. We had to restart a server at a time when we had very high user and transaction volumes (we do ~500,000 transactions in that module per business day). The server shutdown deleted our XSL template cache. As a consequence, many threads tried to recompile the templates concurrently after the restart, which in turn introduced a CPU overload issue. We could only restart the server by blocking the incoming requests at the web server level. In fact, we only allowed a few requests to pass the web server and enter the application server. After some warmup time we opened the web server again and the system started to work at an acceptable CPU load.<br />
<br />
We've looked at the application code that caused the issue and decided to implement an intelligent concurrency pattern that fulfills following requirements:<br />
<br />
- the number of threads that perform a CPU-intensive task concurrently (like XSL template compilation) should be limited to a configurable size (=> avoid CPU overload on startup in user load peak times)<br />
- cache the result of each CPU-intensive task so that it executes only once (we had that already in our error-prone solution)<br />
- enable the system to determine whether the CPU-intensive task needs to execute again (=> if the application was redeployed the templates need to recompile)<br />
<br />
I'll spare you (and myself) the pain of posting the old, non-thread-safe code. The bugs in that code were:<br />
<br />
<b>firstly </b>- it did not limit the number of threads allowed to compile templates<br />
<b>secondly </b>- it did not ensure that only one thread compiles a specific template at a given time (e.g. our startup page) - instead many threads tried to compile the <b>same</b> template concurrently - this one is critical!<br />
<br />
That all being said, here comes the solution to such a "too-many-threads" concurrency issue. Here's the code, and I also added a class diagram, because I believe this is a common situation in web applications.<br />
<br />
The following snippet shows our new <code>HTMLDocumentGenerator</code>. Its responsibility is to create and cache <code>ConcurrentTransformer</code> instances, one for each XSL document. The responsibility of <code>ConcurrentTransformer</code> is to compile an XSL template and to store the result in its <code>template</code> variable. <code>HTMLDocumentGenerator</code> owns the cache (Line 3), defines a limit on the number of threads that can compile XSL documents (Line 5) and declares a <code>Semaphore</code> that implements a "thread bouncer" (Line 6). The <code>createDocument</code> method (Lines 10-20) creates and caches the new <code>ConcurrentTransformer</code> instances for each XSL document. Example: <code>/start/index.xsl</code> will have its own <code>ConcurrentTransformer</code> instance and <code>/start/main_menu.xsl</code> will also have its own unique instance.<br />
<br />
<script src="https://gist.github.com/1206220.js">
</script><br />
<br />
Notice that we use <code>ConcurrentHashMap.putIfAbsent()</code>, which allows cache lookup and cached-object creation in a single step. This is equivalent to:<br />
<br />
<pre class="java" name="code">if (!map.containsKey(key))
    return map.put(key, value);
else
    return map.get(key);
</pre><br />
The difference is that <code>putIfAbsent</code> performs this check-then-act sequence atomically. This is a perfect approach for a multithreaded, high-volume application. You could do the same with your own locks or <code>synchronized</code> blocks, but it would hardly be as safe (and as fast!) as the example above. You will understand this statement if you look at the implementation of <code>ConcurrentHashMap</code>: it locks segments internally, not the whole table! This allows very high volumes of concurrent access without lock contention. In Line 18 we call the method <code>generateTemplate</code> that actually performs the CPU-intensive task (template compilation). <br />
<br />
In Line 5 we declared a <code>Semaphore</code> which acts as a bouncer for the CPU-intensive code sections. Using this class you can control the number of threads that perform a certain task concurrently. It's important to declare the <code>Semaphore</code> in the <code>HTMLDocumentGenerator</code> class, because the thread limit applies to all cached <code>ConcurrentTransformer</code> instances. <br />
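The bouncer idea in isolation looks roughly like this. This is a sketch with assumed names and an assumed permit count, not the production code:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Sketch of the "thread bouncer": at most 'permits' threads may run
// the expensive section concurrently; every additional thread waits.
public class CompileBouncer {

    private final Semaphore bouncer;

    public CompileBouncer(int permits) {
        this.bouncer = new Semaphore(permits);
    }

    public <T> T guarded(Callable<T> expensiveTask) throws Exception {
        bouncer.acquire();               // blocks once the limit is reached
        try {
            return expensiveTask.call(); // the CPU-intensive work
        } finally {
            bouncer.release();           // always hand the permit back
        }
    }
}
```

The acquire/release pair in a try/finally is the important part: a permit must be returned even if the expensive task throws, otherwise the bouncer slowly "leaks" capacity.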
<br />
Now let's look at the <code>ConcurrentTransformer</code> that I declared as a protected inner class of <code>HTMLDocumentGenerator</code>. As an inner class, <code>ConcurrentTransformer</code> instances have direct access to the member variables of <code>HTMLDocumentGenerator</code>. There is one <code>ConcurrentTransformer</code> for each unique XSL template. <br />
<br />
<script src="https://gist.github.com/1206265.js">
</script> <br />
<br />
Let's go through <code>generateTemplate</code> step-by-step:<br />
<br />
<b>Line 9</b>: check if the thread needs to compile this XSL template. If <code>false</code>, return the compiled template. <code>mustCompile</code> returns <code>true</code> if, for example, the <code>template</code> variable is <code>null</code>, which means the template wasn't compiled yet. The first thread that enters <code>generateTemplate</code> will get <code>mustCompile() = true</code> because the <code>ConcurrentTransformer</code> was just created by that thread and the <code>template</code> variable is <code>null</code>.<br />
<b>Line 12</b>: Block all subsequent threads, because only ONE thread can compile THIS XSL template at a given time.<br />
<b>Line 14</b>: check again if the thread needs to compile the XSL template. Sounds weird? Imagine a second thread that was blocked at Line 12 because the first call to <code>mustCompile</code> returned <code>true</code>. This thread does not need to compile again, because the other (faster/first) thread already compiled the template. <br />
<b>Line 17</b>: check if the thread exceeds the permitted number of threads allowed to run compile jobs. Because the <code>bouncer</code> was defined in the <code>HTMLDocumentGenerator</code>, only a limited number of threads will enter the next code block. This actually was the tricky part of the solution: lock threads that want to compile a specific template, but also ensure that the total number of active compilation tasks does not exceed a defined limit! Because the <code>bouncer</code> applies to all cached <code>ConcurrentTransformer</code> instances, this is possible.<br />
<b>Line 19-25</b>: Do the actual compilation work, which is the CPU-intensive part that caused the system overload.<br />
<b>Line 27</b>: Release a permit to allow another thread (in a different <code>ConcurrentTransformer</code> instance) to enter the compilation code.<br />
<b>Line 30</b>: Release the lock so that other threads waiting for that XSL document can continue their processing. (They will return the template that was just compiled, see Line 14.)<br />
<br />
Done! This solution enforces that (1) only one thread will try to compile a specific XSL document at a time and that (2) only a limited number of threads can do compilation work. <br />
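Since the embedded gists may not render everywhere, here is a condensed, self-contained sketch of the combined pattern: per-key locking plus the shared permit limit. The names and the <code>expensiveWork()</code> placeholder are my assumptions, not the production code:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: one lock per cache key, one global permit limit for all keys.
public class GuardedCompilingCache {

    private final ConcurrentHashMap<String, Entry> cache = new ConcurrentHashMap<>();
    private final Semaphore bouncer = new Semaphore(4); // assumed global compile limit

    public String get(String key) throws InterruptedException {
        Entry entry = new Entry();
        Entry existing = cache.putIfAbsent(key, entry); // lookup-or-create in one step
        if (existing != null) {
            entry = existing;
        }
        return entry.compute(key);
    }

    private class Entry {
        private final ReentrantLock lock = new ReentrantLock();
        private volatile String result; // the "compiled template"

        String compute(String key) throws InterruptedException {
            if (result != null) {
                return result;              // fast path, no locking
            }
            lock.lock();                    // only one thread per key past this point
            try {
                if (result != null) {
                    return result;          // re-check: a faster thread already compiled
                }
                bouncer.acquire();          // global limit on concurrent expensive work
                try {
                    result = expensiveWork(key);
                } finally {
                    bouncer.release();
                }
                return result;
            } finally {
                lock.unlock();
            }
        }
    }

    private String expensiveWork(String key) {
        return "compiled:" + key; // stands in for XSL template compilation
    }
}
```

Usage: <code>cache.get("/start/index.xsl")</code> compiles once, every later call for the same key returns the cached result, and at no point do more than four threads (in this sketch) run <code>expensiveWork</code> at the same time.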
<br />
Here is the class diagram that shows the pattern-style structure of the solution:<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiFmCKz7XxfNzv05OKhTyiaAViFGYRsJAIYYTv69Vss6Gp5CDSmoRhgg8JnWznqg90KVHU5kNS7LcWel6iWHebdBavjOAlUNiU3N0os4xdvh7cRSbGmEmgaajINOV_Nze6JBKYfhlHLPvw/s1600/ConcurrentTrasnformer.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="177" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiFmCKz7XxfNzv05OKhTyiaAViFGYRsJAIYYTv69Vss6Gp5CDSmoRhgg8JnWznqg90KVHU5kNS7LcWel6iWHebdBavjOAlUNiU3N0os4xdvh7cRSbGmEmgaajINOV_Nze6JBKYfhlHLPvw/s320/ConcurrentTrasnformer.JPG" width="320" /></a></div><b><br />
<br />
Sleeper bugs and why systems work by coincidence</b><br />
<br />
Some weeks ago I met <a href="http://www.javaspecialists.eu/">Dr. Heinz Kabutz</a>. I attended his <a href="http://www.javaspecialists.eu/courses/master.jsp">Java Specialist Master Course</a> and he opened my eyes to problems like this. Multithreading was his first lesson and he started by saying something like this: "Your web applications are not thread-safe, they just work - by sheer coincidence!" Although it's a provocative statement, he is not wrong ... The described concurrency bug was in our code for a _long_ time, say 8 years? It just didn't show up. What changed? We migrated from WebSphere Application Server 6 to 7, and we decided to use a new XSL parser, optimized for z/OS. The old XSL parser performed well at compilation time, but the compilation result did not perform so well. The new XSL parser promised an optimized compilation result, but it obviously took more CPU during compilation. Now that compilation was CPU-intensive with the new parser, our concurrency bug suddenly turned out to be a big problem. I'd call that a "sleeper bug": it sits there for years, and suddenly it sabotages your system. I believe there are many bugs like this in today's Java applications, and I believe it's not too provocative to say that you too will have some sleeper bugs in your systems. Your systems work - but by sheer coincidence! <br />
<br />
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js">
</script><br />
<br />
<script src="http://gist.github.com/raw/454771/gist-line-number-hack.js">
</script><br />
<br />
<script type="text/javascript">
addLineNumbersToAllGists()
</script>Niklas Schlimmhttp://www.blogger.com/profile/12402045792243894660noreply@blogger.com7