Searching in a search: let’s check Elasticsearch
This is one of the biggest open-source Java projects. Many enterprises, including GitHub, Netflix, and Amazon, use Elasticsearch. It’s been six years since we’ve checked the project, so what new bugs could emerge in such a long time?
What will we discuss?
As the introduction goes, we checked the project six (!) years ago. You can read that article here.
The analyzer has issued thousands of warnings, but to avoid turning this article into a novel on how to configure and review warnings, let’s focus on the ones that caught our eye during a quick look.
First of all, a quick refresher on what Elastricsearch is: it’s one of the most successful search engines trusted by both enterprises and startups.
The flexibility of the engine doesn’t end there: it can run on a laptop and enable you to search within a single folder, or scale across multiple distributed servers and process petabytes of data.
Its large codebase (three million lines of Java code) that powers the project is also worth mentioning. Let’s move on to the check.
Disclaimer: code formatting (indentations, line breaks) may slightly differ from the original, but the main idea of it remains completely intact.
Typos
We’ll start with one of the developers’ worst enemies: typos.
Field by field
Do you write getters often? We hope not, because it’s one of the least interesting things to do. Code generators or handy tools like Lombok save us from boilerplate code.
private final boolean isRunning;
private final boolean isAsync;
….
public boolean isRunning() {
return isRunning;
}
public boolean isAsync() {
return isRunning;
}
Let’s look at the `isRunning` and `isAsync` variables, or rather at their getters. The `#isRunning()` getter returns the `isRunning` value, and `#isAsync()` returns… the same `isRunning` value. The second getter should return `isAsync`, but the code author made a simple but dangerous typo.
The PVS-Studio warning: V6091 Suspicious getter implementation. The ‘isAsync’ field should probably be returned instead. EsqlQueryResponse.java 180
This wasn’t the only instance, we’ve found another getter that doesn’t work as expected:
boolean isDirectory;
boolean isSymbolicLink;
....
@Override
public boolean isDirectory() {
return isDirectory;
}
@Override
public boolean isSymbolicLink() {
return isDirectory;
}
Here, the code author mixed up the `isDirectory` variable with `isSymbolicLink`.
PVS-Studio issued the same warning: V6091 Suspicious getter implementation. The ‘isSymbolicLink’ field should probably be returned instead. DockerFileAttributes.java 67
“equals” == “==”
Newcomers to Java may ask a fairly logical question, “What is the difference between `equals` and `==`?” Let’s look at an example:
private void assertValidJsonInput(String content) {
if (testResponse && ("js" == language || "console-result" == language)
&& null == skip) {
.....
}
}
The PVS-Studio analyzer issues the following warnings for the code above:
V6013 Strings ‘“js”’ and ‘language’ are compared by reference. Possibly an equality comparison was intended. SnippetBuilder.java 232
V6013 Strings ‘“console-result”’ and ‘language’ are compared by reference. Possibly an equality comparison was intended. SnippetBuilder.java 232
One might ask, “What’s wrong, though? I’ve written code like this before, and it worked fine”. The reason it may seem to work lies in the string pool, which reduces the number of string instances to avoid recurrent memory allocation. However, there’s no guarantee that any given string will be in the pool. It’s possible for two strings with the same content to refer to different instances, and a comparison by reference (`==`) will return `false`. So, using `equals` to compare strings is recommended.
While reference comparison may work with strings, it will never yield the correct result with collections (unless we include comparing the same instance to itself).
private final List<Pipe> values;
....
@Override
public final Pipe resolveAttributes(AttributeResolver resolver) {
List<Pipe> newValues = new ArrayList<>(values.size());
for (Pipe v : values) {
newValues.add(v.resolveAttributes(resolver));
}
if (newValues == values) { // <=
return this;
}
return replaceChildrenSameSize(newValues);
}
Here, the newly created `newValues` object is compared to the `values` field, which is always `false`. Most likely, the code author wanted to check whether the new list contains the same items as the existing one, so as not to overwrite it unnecessarily.
For this error, the analyzer issued the following warning: V6013 Objects ‘newValues’ and ‘values’ are compared by reference. Possibly an equality comparison was intended. ConcatFunctionPipe.java 41
I’m right, and I’ve left it behind
Code completion is deeply rooted in the development process. Along with the benefits, it also brings a hefty number of new typos. Let’s take a look at one of those:
public void testReadSlices() throws IOException {
....
try (StreamInput input1 = bytesReference.streamInput();
StreamInput input2 = bytesReference.streamInput()) {
for (int i = 0; i < refs; i++) {
boolean sliceLeft = randomBoolean();
BytesReference left = sliceLeft ?
input1.readSlicedBytesReference()
: input1.readBytesReference();
if (sliceLeft && bytesReference.hasArray()) {
assertSame(left.array(), bytesReference.array()); // <=
}
boolean sliceRight = randomBoolean();
BytesReference right = sliceRight ?
input2.readSlicedBytesReference()
: input2.readBytesReference();
assertEquals(left, right);
if (sliceRight && bytesReference.hasArray()) {
assertSame(right.array(), right.array()); // <=
}
}
}
If we compare the first highlighted line of code with the second, we can assume that the second one implied the following:
assertSame(right.array(), bytesReference.array());
However, the original code compares the result of the `array()` function for the same `right` object.
The PVS-Studio warning: V6009 Function ‘assertSame’ receives an odd argument. The ‘right.array()’ argument was passed several times. AbstractBytesReferenceTestCase.java 713
Let’s say it’s equal
We’ve already discussed the difference between `equals` and `==`. Now, we’ll take a closer look at `equals`, or rather dive into possible errors in its implementation:
@Override
public boolean equals(Object obj) {
....
KeyedFilter other = (KeyedFilter) obj;
return Objects.equals(keys, other.keys)
&& Objects.equals(timestamp, other.timestamp)
&& Objects.equals(tiebreaker, other.tiebreaker)
&& Objects.equals(child(), other.child())
&& isMissingEventFilter == isMissingEventFilter; // <=
}
Developers forgot to put `other` before the second `isMissingEventFilter` here. Due to this small typo, the `isMissingEventFilter` parameter is completely ignored in the comparison.
Here’s the PVS-Studio warning: V6001 There are identical sub-expressions ‘isMissingEventFilter’ to the left and to the right of the ‘==’ operator. KeyedFilter.java 116
Take a look at another typo in the `equals` implementation:
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
IndexError that = (IndexError) o;
return indexName.equals(that.indexName)
&& Arrays.equals(shardIds, that.shardIds)
&& errorType == that.errorType
&& message.equals(that.message)
&& stallTimeSeconds == stallTimeSeconds; // <=
}
Similar to the previous `equals`, here they forgot to add the `that` prefix to the second `stallTimeSeconds`.
Notably, in both cases, the typo occurred in the last line, which is a real development phenomenon. You can read the article on this “last line effect” if you want to dig deeper.
The PVS-Studio analyzer issued the following warning for this code: V6001 There are identical sub-expressions ‘stallTimeSeconds’ to the left and to the right of the ‘==’ operator. IndexError.java 147
Do you like math? What about math in code?
Regardless of your answers, wouldn’t you agree that it’s easy to make a typo in such math code? Take a look at one of such cases:
private static LinearRing parseLinearRing(
ByteBuffer byteBuffer,
boolean hasZ, boolean coerce
) {
....
double[] lons = new double[length];
double[] lats = new double[length];
....
if (linearRingNeedsCoerced(lats, lons, alts, coerce)) { // <=
lons = coerce(lons);
lats = coerce(lats);
if (hasZ) {
alts = coerce(alts);
}
}
....
}
What’s wrong with this code, though? To find the answer, let’s take a look at the arguments of the `linearRingNeedsCoerced` method.
private static boolean linearRingNeedsCoerced(
double[] lons, double[] lats,
double[] alts,
boolean coerce
) {
assert lats.length == lons.length
&& (alts == null || alts.length == lats.length);
assert lats.length > 0;
if (coerce == false) {
return false;
}
final int last = lons.length - 1;
return lons[0] != lons[last]
|| lats[0] != lats[last]
|| (alts != null && alts[0] != alts[last]);
}
After comparing the passed and expected arguments, we can conclude that devs mixed up `lons` and `lats`. By chance, this code doesn’t cause an error. However, any change in how these arguments are checked could cause the method to behave incorrectly.
The PVS-Studio warning: V6029 Possible incorrect order of arguments passed to method: ‘lats’, ‘lons’. WellKnownBinary.java 370
Null is null. What’s next, though?
Let’s look at this code fragment:
public static void assertDeepEquals(ElasticsearchException expected,
ElasticsearchException actual) {
do {
if (expected == null) {
assertNull(actual);
} else {
assertNotNull(actual);
}
assertEquals(expected.getMessage(), actual.getMessage());
....
}
....
}
When `expected == null`, it’s checked whether `actual` is also `null`. That should’ve been the end of the similarity check, but it wasn’t. Object properties are checked next, and if `actual` proves to be `null`, a null-pointer dereference is possible.
This is the warning the analyzer issues for the code snippet: V6008 Potential null dereference of ‘expected’. ElasticsearchExceptionTests.java 1354
Shall we check? Nope
There’s a lot to keep track of during the development process, so it’s easy to miss a little thing like the one in the code below:
private FinishFunction(....) {
....
for (int p = 1; p < fn.getParameters().size(); p++) {
VariableElement param = fn.getParameters().get(p);
if (p == 0) { // <=
if (false == TypeName.get(param.asType()).equals(workType)) {
throw new IllegalArgumentException(
"First argument of "
+ declarationType + "#" + fn.getSimpleName()
+ " must have type " + workType
);
}
continue;
}
....
}
invocationPattern = pattern.append(")").toString();
}
Here, `for` starts at index one, but the body includes a type check for the first element. Most likely, the code author just forgot to adjust the loop boundaries — resulting in the check that will never be executed.
The PVS-Studio analyzer always remembers such subtleties and gives a warning if a developer forgets: V6007 Expression ‘p == 0’ is always false. MvEvaluatorImplementer.java 456
A committer of commits once committed an overcommit
Without further explanation, let’s take a look at the code below:
public void testDLS() throws Exception {
....
int numDocs = scaledRandomIntBetween(32, 128);
int commitAfter = scaledRandomIntBetween(1, numDocs);
....
for (int doc = 1; doc <= numDocs; doc++) {
....
if (doc % 11 == 0) {
iw.deleteDocuments(new Term("id", id));
} else {
if (commitAfter % commitAfter == 0) { // <=
iw.commit();
}
valuesHitCount[valueIndex]++;
}
}
....
}
In this snippet, you may notice that the remainder of dividing `commitAfter` by itself (which is always zero) is compared to 0. As a result, this condition is always true, which causes `iw.commit()` to be executed after every document is created.
PVS-Studio issues the following warning for the suspicious operation: V6001 There are identical sub-expressions ‘commitAfter’ to the left and to the right of the ‘%’ operator. SecurityIndexReaderWrapperIntegrationTests.java 157
What’s going on in this code? The `commitAfter` variable is randomly generated in the range from 1 to `numDocs`. Then, the loop iterates over all numbers from 1 to `numDocs`, and if the remainder of dividing `commitAfter` is zero, the created documents are committed. The division remainder here is responsible for committing every `commitAfter` elements. The fixed condition looks like this:
doc % commitAfter == 0
Never let it go
Let’s look at two code fragments:
@Override
public void setAutoCommit(boolean autoCommit) throws SQLException {
checkOpen();
if (autoCommit == false) {
new SQLFeatureNotSupportedException("Non auto-commit is not supported");
}
}
private String getProperty(String propertyName) {
final String[] settings = getConnectString().split(";");
for (int i = 0; i < settings.length; i++) {
String setting = settings[i].trim();
if (setting.length() > 0) {
final int idx = setting.indexOf('=');
if (idx == -1 || idx == 0 || idx == settings[i].length() - 1) {
new IllegalArgumentException("Invalid connection string: " // <=
+ getConnectString());
}
if (propertyName.equals(setting.substring(0, idx))) {
return setting.substring(idx + 1);
}
}
}
return null;
}
You’ve probably noticed that both fragments have one thing in common: an exception is created but thrown (away), so it’s left there high and dry until the JIT compiler realizes that line is dead code.
It’s worth noting that the first error has been around for a long time — we already covered it in the first Elasticsearch check.
PVS-Studio issued two warnings respectively:
V6006 The object was created but it is not being used. The ‘throw’ keyword could be missing. JdbcConnection.java 93
V6006 The object was created but it is not being used. The ‘throw’ keyword could be missing. AzureStorageSettings.java 352
To become the best version just a copy of yourself
Let’s look at an error caused by a variable overlap:
private final PatternSet patternSet = new PatternSet().include("**/*.class");
....
public void setPatternSet(PatternSet patternSet) {
patternSet.copyFrom(patternSet);
}
The code author likely intended the above setter to copy the state of the passed `patternSet` to the class field, but the object ends up copying its own state due to the missing `this` keyword.
Let’s look at the warning that the analyzer issued for this suspicious method call: V6100 An object is used as an argument to its own method. Consider checking the first actual argument of the ‘copyFrom’ method. CheckForbiddenApisTask.java 157
Ctrl+C, Ctrl+V, Ctrl+V, Ctrl+V, Ctr… Oops
At first glance, the following code contains a minor error: a redundant check for `null`. Is that it, though? I invite you to try finding a more serious typo — the name of the section is the clue.
private void test(Snippet test) {
setupCurrent(test);
if (test.continued()) {
/* Catch some difficult to debug errors with // TEST[continued]
* and throw a helpful error message. */
if (previousTest == null
|| previousTest.path().equals(test.path()) == false) {
throw new InvalidUserDataException("// TEST[continued] "
+ "cannot be on first snippet in a file: " + test);
}
if (previousTest != null && previousTest.testSetup()) {
throw new InvalidUserDataException("// TEST[continued] "
+ "cannot immediately follow // TESTSETUP: " + test);
}
if (previousTest != null && previousTest.testSetup()) {
throw new InvalidUserDataException("// TEST[continued] "
+ "cannot immediately follow // TEARDOWN: " + test);
}
}
....
}
It’s great if you caught the typo, and if you didn’t, we’ll walk through it.
Take a look at the conditions in the last two `if` blocks — they’re identical. Could it be done on purpose, though? Now, let’s look at the `then` blocks — specifically at the exception messages that are thrown there. They’re different. Most likely, the code author wanted to check two different conditions, but the copy-paste turned the code into what it is now.
We can assume that the second condition should’ve looked like this (let’s not forget that we can remove the `null` check):
if (previousTest.testTearDown()) {
....
}
Let’s also take a look at the warnings the PVS-Studio analyzer issued for this code snippet:
V6007 Expression ‘previousTest != null’ is always true. RestTestsFromDocSnippetTask.java 247
V6007 Expression ‘previousTest != null’ is always true. RestTestsFromDocSnippetTask.java 249
A check, and another check, and another one
It’s worth checking the input data, especially if it came from a user. In this section, we’ll look at the checks that have gone too far.
A token or a token?
Let’s start with a strange check of a current token.
protected BulkByScrollTask.Status doParseInstance(
XContentParser parser
) throws IOException {
XContentParser.Token token;
if (parser.currentToken() == XContentParser.Token.START_OBJECT) {
token = parser.nextToken(); // <=
} else {
token = parser.nextToken(); // <=
}
ensureExpectedToken(XContentParser.Token.START_OBJECT, token, parser);
token = parser.nextToken();
ensureExpectedToken(XContentParser.Token.FIELD_NAME, token, parser);
return innerParseStatus(parser);
}
We’ll break down the above method. The `parser` object is passed in, we check its current token. If it’s the start of the object, we move to the next one, but if it’s not… we do the same! The most interesting thing is that right after that, we check whether this token is also the start of an object!
Could this odd behavior be the result of multiple people editing the code? Unlikely. This method showed up in a single commit and has never been changed since.
The PVS-Studio warning: V6004 The ‘then’ statement is equivalent to the ‘else’ statement. BulkByScrollTaskStatusTests.java 189
You can’t see the lines. Now you still can’t see them
Let’s take a look at another redundant check:
@Override
protected StringBuilder contentToWKT() {
final StringBuilder sb = new StringBuilder();
if (lines.isEmpty()) { // <=
sb.append(GeoWKTParser.EMPTY);
} else {
sb.append(GeoWKTParser.LPAREN);
if (lines.size() > 0) { // <=
sb.append(ShapeBuilder
.coordinateListToWKT(lines.get(0).coordinates)
);
}
for (int i = 1; i < lines.size(); ++i) {
sb.append(GeoWKTParser.COMMA);
sb.append(ShapeBuilder
.coordinateListToWKT(lines.get(i).coordinates)
);
}
sb.append(GeoWKTParser.RPAREN);
}
return sb;
}
The code contains an obviously redundant `lines.size() > 0` check. Since we’ve already checked whether `lines` is empty, this execution branch will only be executed if there’s at least one element.
The PVS-Studio warning: V6007 Expression ‘lines.size() > 0’ is always true. MultiLineStringBuilder.java 81
A parenthesis? Are you sure about that?
Now we’ll poke around in the code a bit. First, let’s take a look at the following method:
private static List<Coordinate> parseCoordinateList(
StreamTokenizer stream,
final boolean ignoreZValue,
final boolean coerce
) throws IOException, ElasticsearchParseException {
....
while (nextCloserOrComma(stream).equals(COMMA)) {
....
if (isOpenParen && nextCloser(stream).equals(RPAREN) == false) { // <=
throw new ElasticsearchParseException(
"expected: " + RPAREN
+ " but found: " + tokenString(stream),
stream.lineno()
);
}
}
return coordinates.build();
}
Unless you know the Elasticsearch code by heart, the error here probably won’t be obvious.
The PVS-Studio analyzer warning: V6007 Expression ‘nextCloser(stream).equals(RPAREN) == false’ is always false. GeoWKTParser.java 164
Now we know what the error is, but why is the result of the `nextCloser` function always equivalent to `RPAREN`? Let’s take a look at the method body:
private static String nextCloser(StreamTokenizer stream)
throws IOException, ElasticsearchParseException {
if (nextWord(stream).equals(RPAREN)) {
return RPAREN;
}
throw new ElasticsearchParseException(
"expected: " + RPAREN
+ " but found: " + tokenString(stream),
stream.lineno()
);
}
Turns out that `nextCloser` either returns `RPAREN` or throws an exception! Also, note that it’s identical to the one in the method that triggered the warning.
We can simplify the code:
private static List<Coordinate> parseCoordinateList(
StreamTokenizer stream,
final boolean ignoreZValue,
final boolean coerce
) throws IOException, ElasticsearchParseException {
....
while (nextCloserOrComma(stream).equals(COMMA)) {
....
if (isOpenParen) {
nextCloser(stream);
}
}
return coordinates.build();
}
Emptiness x2
Here’s an easy way to combine lists:
public static <T> List<T> combine(
List<? extends T> left,
List<? extends T> right
) {
if (right.isEmpty()) {
return (List<T>) left;
}
if (left.isEmpty()) {
return (List<T>) right;
}
List<T> list = new ArrayList<>(left.size() + right.size());
if (left.isEmpty() == false) { // <=
list.addAll(left);
}
if (right.isEmpty() == false) { // <=
list.addAll(right);
}
return list;
}
Having checked whether the lists are empty, we don’t need to do it again when filling the union. I should note that this can’t be called an error, but it’s possible to write simpler and more compact code here.
PVS-Studio detected redundant checks and issued two warnings:
V6007 Expression ‘left.isEmpty() == false’ is always true. CollectionUtils.java 33
V6007 Expression ‘right.isEmpty() == false’ is always true. CollectionUtils.java 36
Do you have a substring? Do you? Let me check!
Let’s speed up and look at another not-so-obvious case:
@TaskAction
public void checkDependencies() {
....
for (File file : licensesDirAsFile.listFiles()) {
String name = file.getName();
if (name.endsWith("-LICENSE") || name.endsWith("-LICENSE.txt")) {
// TODO: why do we support suffix of LICENSE *and* LICENSE.txt??
licenses.put(name, false);
} else if (name.contains("-NOTICE")
|| name.contains("-NOTICE.txt")) { // <=
notices.put(name, false);
} else if (name.contains("-SOURCES")
|| name.contains("-SOURCES.txt")) { // <=
sources.put(name, false);
}
}
....
}
This code works correctly, but we also can simplify it by removing redundant checks. Let’s walk through the logic, and to make sure there’s no confusion, we’ll use a specific example.
We have the `”FILENAME-NOTICE.txt”` string. Does it contain the `”-NOTICE.txt”` substring? And what about `”-NOTICE “`? Both checks are true. So, there’s no point in checking `”-NOTICE.txt”` and `”-NOTICE”` separately.
We have the exact same situation with the `”-SOURCES.txt”` and `”-SOURCES”`.
Take a look at the analyzer warnings:
V6007 Expression ‘name.contains(“-NOTICE.txt”)’ is always false. DependencyLicensesTask.java 228
V6007 Expression ‘name.contains(“-SOURCES.txt”)’ is always false. DependencyLicensesTask.java 230
That’s the way the cookie crumbles
To wrap up the section, let’s take a look at the code from one of the many tests:
public void testMinimumPerNode() {
int negativeShardsPerNode = between(-50_000, 0);
try {
if (frequently()) {
clusterAdmin().prepareUpdateSettings(....)
.setPersistentSettings(
Settings.builder()
.put(shardsPerNodeKey, negativeShardsPerNode)
.build()
)
.get();
} else {
clusterAdmin().prepareUpdateSettings(....)
.setPersistentSettings(
Settings.builder()
.put(shardsPerNodeKey, negativeShardsPerNode)
.build()
)
.get();
}
fail("should not be able to set negative shards per node");
} catch (IllegalArgumentException ex) {
....
}
}
The PVS-Studio warning: V6004 The ‘then’ statement is equivalent to the ‘else’ statement. ClusterShardLimitIT.java 52
To see why `then` is equivalent to `else`, we’ll turn to Git History. The method appeared about six years ago. The `If`-`else` statement still made sense back then: settings could be set in two ways: by default, and by `setTransientSettings`. However, some time later, developers marked the latter method as deprecated. They replaced it with `setPersistentSettings`, but failed to notice that `then` and `else` became identical.
Is that how it works?
Let’s start by looking at a method for getting a random number with a random type (`double` or `float`):
protected Number randomNumber() {
/*
* The source parser and doc values round trip will both reduce
* the precision to 32 bits if the value is more precise.
* randomDoubleBetween will smear the values out across a wide
* range of valid values.
*/
return randomBoolean() ?
randomDoubleBetween(-Float.MAX_VALUE, Float.MAX_VALUE, true)
: randomFloat();
}
It seems that nothing unusual is happening: the method randomly returns either `float` or `double`.
The PVS-Studio warning: V6088 Result of this expression will be implicitly cast to ‘double’. Check if program logic handles it correctly. ScaledFloatFieldMapperTests.java 505
But how!? Could the issue be that `#randomFloat()` actually returns `double`? We’ll take a look at it, too:
public static float randomFloat() {
return random().nextFloat();
}
Everything’s correct here as well. Could there be something wrong with the return type of `#randomDoubleBetween`?
public static double randomDoubleBetween(
double start, double end,
boolean lowerInclusive
) {
double result = 0.0;
....
return result;
}
Is the analyzer wrong? We can assure you that it’s not. Assuming the & return type is a reference, the primitive value of the ternary operator is evaluated first, and it undergoes conversion in numeric contexts (`float` is cast to `double`). Only then the resulting value is boxed.
It’s also worth noting that Elasticseach has plenty of similar errors. But this isn’t about sloppy coding, the behavior is indeed unpredictable. A static analyzer that knows about such subtleties in implementation can save the day.
To fix this, we can convert the ternary operator to the regular `if`-`else`.
if (randomBoolean()) {
return randomDoubleBetween(-Float.MAX_VALUE, Float.MAX_VALUE, true);
} else {
return randomFloat();
}
Here’s an example of a similar error:
public static Number truncate(Number n, Number precision) {
....
return n instanceof Float ? result.floatValue() : result;
}
The PVS-Studio analyzer issues the following warning: V6088 Result of this expression will be implicitly cast to ‘double’. Check if program logic handles it correctly. Maths.java 122
Let’s summarize
Kudos to the devs for developing and maintaining a project as large as Elasticsearch — with over three million lines of Java code, it’s a tremendous undertaking. However, we’re all human, and we all make mistakes that we can sometimes overlook.
In this article, we’ve shown only a small part of all the analyzer warnings, but the project has more. So, to avoid breaking the review into too many parts, we’ve highlighted the most noteworthy bugs (in our opinion).
If you’d like to try PVS-Studio, you can download the trial version. For open-source projects, we have a free licensing option.