# Aliyun ODPS Plugin for LogStash

## Getting Started
---

### Introduction

- ODPS (Open Data Processing Service) is a large-scale data processing platform developed by Alibaba.

### Install the Plugin

Use the `logstash-plugin` tool to install the plugin from the gem file:

```
$ {YOUR_LOGSTASH_DIRECTORY}/bin/logstash-plugin install logstash-output-odpstunnel-1.0.0.gem
```

### Building the plugin

```
jruby -S gem build logstash-output-odpstunnel.gemspec
```

> Hint: when upgrading the ODPS SDK, replace the jars in `vendor/jar-dependencies/runtime-jars` with the new `odps-sdk-core` and `odps-sdk-commons` jars and their corresponding dependencies.

### Sample config
```
input {
  stdin { }
}
filter {
  alter {
    add_field => { "data" => "%{message}, from %{host}" }
    add_field => { "place" => "usa" }
    add_field => { "biginttest" => "55555" }
    add_field => { "doubletest" => "3.5" }
    add_field => { "datetimetest" => "2015-12-04 23:45:06" }
  }
}
output {
  odpstunnel {
    shard_number => 1
    aliyun_access_id => "************"
    aliyun_access_key => "************"
    aliyun_odps_endpoint => "******************"
    project => "your_projectName"
    table => "your_tableName"
    partition => "time=$<datetimetest.strftime('%Y-%m-%d')>,place=$<place>"
    partition_time_format => "%Y-%m-%d %H:%M:%S"
    value_fields => ["data","biginttest","doubletest","datetimetest"]
  }
}
```

### Parameters
- aliyun_access_id (Required): your Aliyun access ID.
- aliyun_access_key (Required): your Aliyun access key.
- aliyun_odps_endpoint (Required): your ODPS service endpoint.
- project (Required): your project name.
- table (Required): your table name.
- value_fields (Required): the list of fields to write; each name must match a key in the source event.
- partition (Optional): set this if your table is partitioned.
    - partition format:
        - fixed string: partition ctime=20150804
        - keyword (value taken from a source field): partition ctime=$\<remote>
        - keyword in time format: partition ctime=$\<datetime.strftime('%Y%m%d')>
- partition_time_format (Optional):
    - If \<partition> uses a keyword whose value is a time, set \<partition_time_format> to the format of the source value. Example: for source[datetime] = "29/Aug/2015:11:10:16 +0800", set \<partition_time_format> to "%d/%b/%Y:%H:%M:%S %z". If not set, the format is inferred automatically using Ruby's `Time.parse`.
- batch_size (Optional): number of messages to send per batch; default 100.
- batch_timeout (Optional): maximum interval between sends. Messages are sent when this interval elapses even if the queue has not reached the batch size; default 1s.
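
The partition formats above can be illustrated with a small Ruby sketch. This is not the plugin's actual implementation; `resolve_partition` and `PLACEHOLDER` are hypothetical names, and the sketch assumes the source event is a plain Hash with string keys. It expands `$<key>` and `$<key.strftime('FORMAT')>` placeholders, parsing time values with the given format when one is supplied and falling back to `Time.parse` otherwise.

```ruby
require 'time'

# Hypothetical pattern for `$<key>` and `$<key.strftime('FORMAT')>`.
PLACEHOLDER = /\$<(\w+)(?:\.strftime\('([^']*)'\))?>/

# Hypothetical helper (an assumption for illustration, not the plugin's code):
# expands the placeholders in a partition spec using values from `event`.
def resolve_partition(spec, event, time_format = nil)
  spec.gsub(PLACEHOLDER) do
    key = Regexp.last_match(1)
    out_format = Regexp.last_match(2)
    value = event.fetch(key).to_s # raises KeyError if the field is missing
    next value if out_format.nil?

    # Use the explicit source format when given, else let Time.parse infer it.
    parsed = time_format ? Time.strptime(value, time_format) : Time.parse(value)
    parsed.strftime(out_format)
  end
end

event = { 'place' => 'usa', 'datetimetest' => '2015-12-04 23:45:06' }
spec = "time=$<datetimetest.strftime('%Y-%m-%d')>,place=$<place>"
puts resolve_partition(spec, event, '%Y-%m-%d %H:%M:%S')
# prints "time=2015-12-04,place=usa"
```

A fixed string such as `ctime=20150804` contains no placeholders and passes through unchanged.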

### Supported ODPS Types

- String
- BigInt
- Double
- DateTime
	- The format of a source datetime that is not used for partitioning is inferred automatically using Ruby's `Time.parse`.
- Boolean
	- True when the value, compared case-insensitively, equals `"true"`. Any other value is False.
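
As a rough illustration of the conversions listed above, the following Ruby sketch maps string field values to Ruby values by ODPS type. It is an assumption for illustration only, not the plugin's actual code: `coerce` and the type symbols are hypothetical names, and strict parsing with `Integer` and `Float` is a choice made here.

```ruby
require 'time'

# Hypothetical helper (not the plugin's code): converts a raw string field
# into a Ruby value for the given ODPS column type.
def coerce(value, odps_type)
  case odps_type
  when :string   then value.to_s
  when :bigint   then Integer(value.to_s, 10) # raises ArgumentError on bad input
  when :double   then Float(value.to_s)       # raises ArgumentError on bad input
  when :datetime then Time.parse(value.to_s)  # format inferred by Time.parse
  when :boolean  then value.to_s.downcase == 'true'
  else raise ArgumentError, "unsupported ODPS type: #{odps_type}"
  end
end

puts coerce('55555', :bigint)  # prints 55555
puts coerce('3.5', :double)    # prints 3.5
puts coerce('TRUE', :boolean)  # prints true
puts coerce('yes', :boolean)   # prints false
```

Values such as the sample config's `"2015-12-04 23:45:06"` parse with `Time.parse` without an explicit format; the result is interpreted in the local time zone.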


## Useful Links
---

- [LogStash User Guide](https://www.elastic.co/products/logstash)

## License
---

Licensed under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0.html).