

CDAP-4790

The issue
The following code behaves differently on distributed CDAP than on the SDK when `columnsToUpdate` is an empty `byte[][]`.

// suppose we have logic to retrieve a certain set of columns, and then re-write them based upon the existing values
Table myTable = getTable("myTable");
byte[][] columnsToUpdate = computeColumnsToUpdate();
Row result = myTable.get(row, columnsToUpdate);
// update the columns, based upon the existing values in the 'result'

When columnsToUpdate.length is 0, the HBase implementation of Table retrieves all columns of the row. This can be unexpected, because the logic in the above code would then go on to update every column in the row. It is also inconsistent with the LevelDB and InMemory implementations of Table, which return an empty result (no columns) if `columnsToUpdate` has length 0.
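The divergence can be illustrated with a small self-contained model. This is only a sketch: the class `EmptyColumnsDivergence` and its methods are hypothetical, a row is modeled as a plain map with String columns instead of byte[], and none of this is the actual CDAP implementation code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the two behaviors described above (not CDAP code).
public class EmptyColumnsDivergence {

  // Models the reported HBase behavior: an empty column list selects every column.
  static Map<String, String> getHBaseStyle(Map<String, String> row, String... columns) {
    if (columns.length == 0) {
      return new LinkedHashMap<>(row);
    }
    return select(row, columns);
  }

  // Models the reported LevelDB/InMemory behavior: an empty column list selects nothing.
  static Map<String, String> getLevelDbStyle(Map<String, String> row, String... columns) {
    return select(row, columns);
  }

  // Copies only the requested columns that exist in the row.
  private static Map<String, String> select(Map<String, String> row, String... columns) {
    Map<String, String> result = new LinkedHashMap<>();
    for (String column : columns) {
      if (row.containsKey(column)) {
        result.put(column, row.get(column));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> row = new LinkedHashMap<>();
    row.put("a", "1");
    row.put("b", "2");
    String[] columnsToUpdate = new String[0];

    // Same call, different result depending on the backing implementation.
    System.out.println(getHBaseStyle(row, columnsToUpdate).size());   // prints 2
    System.out.println(getLevelDbStyle(row, columnsToUpdate).size()); // prints 0
  }
}
```

With a non-empty column list, both variants in this model return the same result; they only disagree on the empty case.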

One of two things needs to happen:
1) The LevelDB and InMemory implementations of Table are updated to return all columns when columns is empty.
2) The HBase implementation of Table is updated to return zero columns when columns is empty.

I am suggesting the second option, because the user may not always know the length of the requested columns (it may be computed programmatically, as in the code snippet above), and it would be unexpected to retrieve all columns when requesting zero columns.
To request all columns, the user would still be able to use the `Table#get(byte[] row)` API.
This would also make it consistent with the `Table#delete(byte[], byte[][] columns)` API, which deletes nothing if columns.length is 0.
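The contract under the second option can be sketched as follows. This is an assumption-laden model rather than the real Table implementation: `ProposedTableModel` is a hypothetical class, rows and columns are Strings instead of byte[], and the three methods only mirror `Table#get(byte[] row)`, `Table#get(byte[] row, byte[][] columns)` and `Table#delete(byte[], byte[][] columns)` in spirit.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory model of the proposed semantics (not CDAP code).
public class ProposedTableModel {
  private final Map<String, Map<String, String>> rows = new HashMap<>();

  void put(String row, String column, String value) {
    rows.computeIfAbsent(row, r -> new LinkedHashMap<>()).put(column, value);
  }

  // Counterpart of Table#get(byte[] row): always returns every column of the row.
  Map<String, String> get(String row) {
    return new LinkedHashMap<>(rows.getOrDefault(row, Collections.emptyMap()));
  }

  // Counterpart of Table#get(byte[] row, byte[][] columns):
  // returns exactly the requested columns, so an empty request yields an empty result.
  Map<String, String> get(String row, String[] columns) {
    Map<String, String> stored = rows.getOrDefault(row, Collections.emptyMap());
    Map<String, String> result = new LinkedHashMap<>();
    for (String column : columns) {
      if (stored.containsKey(column)) {
        result.put(column, stored.get(column));
      }
    }
    return result;
  }

  // Counterpart of Table#delete(byte[], byte[][] columns):
  // removes exactly the listed columns, so an empty list deletes nothing.
  void delete(String row, String[] columns) {
    Map<String, String> stored = rows.get(row);
    if (stored == null) {
      return;
    }
    for (String column : columns) {
      stored.remove(column);
    }
  }

  public static void main(String[] args) {
    ProposedTableModel table = new ProposedTableModel();
    table.put("row1", "a", "1");
    table.put("row1", "b", "2");

    System.out.println(table.get("row1", new String[0])); // prints {}
    System.out.println(table.get("row1"));                // prints {a=1, b=2}
    table.delete("row1", new String[0]);
    System.out.println(table.get("row1"));                // prints {a=1, b=2}
  }
}
```

In this model, get and delete treat an empty column list the same way: as a request that touches no columns.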

Other Changes
If we make the second change, then the semantics of the Get.java and Delete.java classes will also have to change for consistency when the columns being passed in are empty.
The Put.java class API is different in that it has no way to add more than one column at a time, so no changes will be made to it. (It is a bit strange that its API differs from Get/Delete in that way, though.)

// Currently, when `columns` is empty, the following constructors mean that every column will be retrieved or deleted.
// With the proposed change, empty columns will mean that no columns are retrieved or deleted.


// Get.java
 
public Get(byte[] row, byte[]... columns) {

public Get(byte[] row, Collection<byte[]> columns) {


public Get(String row, String... columns) {

public Get(String row, Collection<String> columns) { 


// Delete.java

public Delete(byte[] row, byte[]... columns) {
 
public Delete(byte[] row, Collection<byte[]> columns) {

public Delete(String row, String... columns) { 

// The following method will be added for consistency (it is currently not there)
public Delete(byte[] row, Collection<String> columns) { 
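One way a Get-like class could tell "all columns" apart from "an explicitly empty column list" is sketched below. Everything here is an assumption made for illustration: `GetSketch` is a hypothetical class, it uses Strings only, and representing "all columns" as a null list is one possible choice, not necessarily how Get.java or Delete.java would implement the change.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch (not the real Get.java).
public class GetSketch {
  private final String row;
  // Assumption: null means "all columns"; an empty list means "no columns".
  private final List<String> columns;

  // Row-only constructor: selects every column.
  public GetSketch(String row) {
    this.row = row;
    this.columns = null;
  }

  // Explicit column list, possibly empty: selects exactly these columns.
  public GetSketch(String row, String... columns) {
    this(row, Arrays.asList(columns));
  }

  // Explicit column collection, possibly empty: selects exactly these columns.
  public GetSketch(String row, Collection<String> columns) {
    this.row = row;
    this.columns = Collections.unmodifiableList(new ArrayList<>(columns));
  }

  public String getRow() {
    return row;
  }

  public boolean selectsAllColumns() {
    return columns == null;
  }

  // Returns the explicit column list, or null when all columns are selected.
  public List<String> getColumns() {
    return columns;
  }

  public static void main(String[] args) {
    System.out.println(new GetSketch("row1").selectsAllColumns());                // prints true
    System.out.println(new GetSketch("row1", new String[0]).selectsAllColumns()); // prints false
    System.out.println(new GetSketch("row1", new String[0]).getColumns());        // prints []
  }
}
```

Under this sketch, a caller whose column list is computed programmatically never falls into the "all columns" case by accident, because only the row-only constructor selects every column.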

3 Comments

  1. +1 for returning nothing when zero columns are requested. We have a different API to return all columns.

    In my opinion, the current behavior is a bug, not an intention. 

  2. When you wrote in the above:

    // The following method will be added for consistency (it is currently not there)
    public Delete(byte[] row, Collection<byte[]> columns) { 

    did you actually mean (since that method is the second of the three preceding lines) to use a "Collection<String>" instead?:

    // The following method will be added for consistency (it is currently not there)
    public Delete(byte[] row, Collection<String> columns) { 
    1. Right; that is what I meant. Thanks.
      Updated.