Spring Batch Primer, Part 28: FieldSet, the ResultSet of Flat Files

2026-05-17 • Spring Batch: From Getting Started to Production

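Spring Batch Primer, Part 28. FieldSet is the smallest unit of abstraction in flat file I/O. These study notes cover its JDBC ResultSet-like type-safe accessors (readString, readInt, readLong, readBigDecimal, readDate), index vs. name access, null and empty handling, format exceptions, and the SimpleDateFormat pitfall inside DefaultFieldSet.

This is part 28 of the 48-part series Spring Batch: From Getting Started to Production. Part 27 gave the flat file overview and introduced the three parsing components; part 28 covers the one in the middle, FieldSet.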

FieldSet: the JDBC ResultSet of flat files

A FieldSet is Spring Batch's abstraction for enabling the binding of fields from a file resource. It allows developers to work with file input in much the same way as they would work with database input. A FieldSet is conceptually similar to a JDBC ResultSet. (Source: official reference)

The core analogy is simple. Just as you handle one row at a time in JDBC (Java's standard API for database access), you handle one line of a flat file as a FieldSet.

JDBC `ResultSet`	`FieldSet`
One row	One line
`rs.getString(1)` (columns are 1-based)	`fs.readString(0)` (indexes are 0-based)
`rs.getString("name")`	`fs.readString("name")`
`rs.getInt("age")`	`fs.readInt("age")`
`rs.getDate("createdAt")`	`fs.readDate("createdAt")`

In other words, code that reads a flat file runs on the same mental model as JDBC.

The simplest usage

String[] tokens = new String[]{"foo", "1", "true"};
FieldSet fs = new DefaultFieldSet(tokens);

String name = fs.readString(0);          // "foo"
int value = fs.readInt(1);                // 1
boolean booleanValue = fs.readBoolean(2); // true

If all you have is a String[], you can wrap it in DefaultFieldSet (the default FieldSet implementation) and access the values by index right away.

Access by name (recommended)

String[] tokens = new String[]{"foo", "1", "true"};
String[] names = new String[]{"name", "count", "active"};
FieldSet fs = new DefaultFieldSet(tokens, names);

String name = fs.readString("name");
int value = fs.readInt("count");
boolean active = fs.readBoolean("active");

Access by name has a few advantages: the code stays the same when the column order changes, readability improves, and debugging gets easier.

The tokenizer.setNames("id", "name", "email") call from part 27 is exactly what fills in these FieldSet names automatically.

All type conversion methods

The FieldSet read* method family looks like this.

Method	Returns	Conversion rule
`readString(int / String)`	String	as-is, with surrounding whitespace trimmed
`readBoolean(int / String)`	boolean	"true" is true, anything else is false
`readBoolean(int, String trueValue)`	boolean	custom true value (for example, "Y")
`readChar(int / String)`	char	the single character (the value must be exactly one character long)
`readByte(int / String)`	byte	parseByte
`readShort(int / String)`	short	parseShort
`readInt(int / String)`	int	parseInt
`readInt(int, int defaultValue)`	int	default when empty
`readLong(int / String)`	long	parseLong
`readLong(int, long defaultValue)`	long	default when empty
`readFloat(int / String)`	float	parseFloat
`readDouble(int / String)`	double	parseDouble
`readBigDecimal(int / String)`	BigDecimal	new BigDecimal(token)
`readBigDecimal(int, BigDecimal default)`	BigDecimal	default when empty
`readDate(int / String)`	Date	default pattern (`yyyy-MM-dd`)
`readDate(int, String pattern)`	Date	explicit pattern
`readDate(int, Date defaultValue)`	Date	default when empty

The key value is that Spring Batch standardizes these conversions, so each batch job no longer parses input in its own way.

Rather than each batch job parsing differently in potentially unexpected ways, it can be consistent, both when handling errors caused by a format exception, or when doing simple data conversions. (Source: official reference)

Custom boolean values

fs.readBoolean("active", "Y")        // "Y" = true, anything else = false
fs.readBoolean("flag", "TRUE")       // "TRUE" = true

This is useful when the CSV uses nonstandard booleans such as Y/N or T/F.

Specifying a date pattern

fs.readDate("createdAt", "yyyy-MM-dd")
fs.readDate("orderDate", "yyyyMMddHHmmss")

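If you omit the pattern, DefaultFieldSet parses with its single default pattern, yyyy-MM-dd; it does not try several formats in turn. Internally, Spring Batch uses SimpleDateFormat (Java's date formatting class).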

Date pitfall: SimpleDateFormat thread safety

SimpleDateFormat is not thread-safe, and the same goes for the DefaultFieldSet that holds one internally.

So in the multi-threaded Step setup covered in part 37, multiple threads must not share a single FieldSet. The normal flow is for each thread to get its own FieldSet.
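
As a contrast to SimpleDateFormat, here is a minimal JDK-only sketch that parses dates from a parallel stream with one shared DateTimeFormatter, which is immutable and safe to share. It uses no Spring Batch classes, and the class and method names (SharedFormatterDemo, parseAll) are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Spring Batch.
final class SharedFormatterDemo {

    // DateTimeFormatter is immutable, so one instance can be shared across threads.
    private static final DateTimeFormatter ISO_DAY = DateTimeFormatter.ofPattern("uuuu-MM-dd");

    // Parses every raw value; the parallel stream may run the work on several threads.
    static List<LocalDate> parseAll(List<String> rawDates) {
        return rawDates.parallelStream()
                .map(raw -> LocalDate.parse(raw.trim(), ISO_DAY))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(parseAll(List.of("2024-01-31", " 2024-02-29 ")));
    }
}
```

Doing the same with a single shared SimpleDateFormat would need external synchronization or one instance per thread.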

These days LocalDate and LocalDateTime are recommended, but FieldSet's readDate returns java.util.Date, so one conversion step is needed.

Date legacyDate = fs.readDate("createdAt", "yyyy-MM-dd");
LocalDate modern = legacyDate.toInstant()
    .atZone(ZoneId.systemDefault())
    .toLocalDate();

Alternatively, you can parse directly in a custom FieldSetMapper (the interface that maps a FieldSet to a domain object).

@Override
public Customer mapFieldSet(FieldSet fs) {
    return new Customer(
        fs.readLong("id"),
        fs.readString("name"),
        LocalDate.parse(fs.readString("createdAt"))
    );
}

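Null and empty handling

Null handling in `readString`

String value = fs.readString("name");

readString returns the token with surrounding whitespace trimmed, so an empty token comes back as "" and a null token as null. If you need the untrimmed value, including trailing whitespace, use readRawString.

Empty values in `readInt`

int count = fs.readInt("count");           // throws NumberFormatException when empty
int countOrZero = fs.readInt("count", 0);  // falls back to 0 when empty

The form with a default argument is recommended. It states in code which value an empty cell should become.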
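
The "empty means default" rule can be sketched in plain Java as follows. This is only an illustration of the rule, not how DefaultFieldSet is implemented, and DefaultingParser and parseIntOrDefault are hypothetical names.

```java
// Hypothetical helper that mirrors the "empty means default" rule in plain Java.
final class DefaultingParser {

    // Returns defaultValue for null or blank tokens; anything else must be a valid int.
    static int parseIntOrDefault(String token, int defaultValue) {
        if (token == null || token.trim().isEmpty()) {
            return defaultValue;
        }
        // Integer.parseInt throws NumberFormatException, a subclass of IllegalArgumentException.
        return Integer.parseInt(token.trim());
    }

    public static void main(String[] args) {
        System.out.println(parseIntOrDefault("", 0));
        System.out.println(parseIntOrDefault(" 42 ", 0));
    }
}
```

Note that the default only covers missing values. A non-empty token that is not a number, such as "abc", still fails, which is usually what you want for data quality.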

Empty values in `readBigDecimal`

BigDecimal amount = fs.readBigDecimal("amount", BigDecimal.ZERO);

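What a format exception means

String[] tokens = {"abc"};           // text where a number is expected
FieldSet fs = new DefaultFieldSet(tokens);
int value = fs.readInt(0);            // ★ throws NumberFormatException (an IllegalArgumentException)

When a format error occurs, FieldSet read methods throw IllegalArgumentException or a subclass such as NumberFormatException. Inside FlatFileItemReader the failure surfaces wrapped in a FlatFileParseException, and the skip logic from part 14 can skip that row once the exception type is registered as skippable.

The FieldSet stage is the natural place to build in data quality checks.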

Using DefaultFieldSet directly (rarely needed)

FieldSet fs = new DefaultFieldSet(
    new String[]{"foo", "1", "true"},
    new String[]{"name", "count", "active"}
);

In most cases you never create one yourself. The LineTokenizer (the interface that splits one line into tokens) builds it for you.

Direct construction is mostly limited to unit tests or a conversion step inside a custom reader.

FieldSet metadata methods

int count = fs.getFieldCount();              // number of fields
String[] values = fs.getValues();            // all tokens
String[] names = fs.getNames();              // all names (IllegalStateException if no names were set)
Properties props = fs.getProperties();       // name → value map (also requires names)

These methods give access to the FieldSet as a whole.

Usage example: inside a reader

@Bean
public FieldSetMapper<Customer> customerMapper() {
    return fieldSet -> new Customer(
        fieldSet.readLong("id"),
        fieldSet.readString("name"),
        fieldSet.readString("email"),
        fieldSet.readDate("createdAt", "yyyy-MM-dd"),
        fieldSet.readBigDecimal("balance", BigDecimal.ZERO),
        fieldSet.readBoolean("active", "Y")
    );
}

The flow pulls each field out in a type-safe way and assembles a domain object.

The writer-side mirror: FieldExtractor

public interface FieldExtractor<T> {
    Object[] extract(T item);
}

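This is the reverse of the reader: an interface that pulls an Object[] out of a domain object. The standard implementations are as follows.

Implementation	Behavior
`BeanWrapperFieldExtractor`	matches property names (getters)
`RecordFieldExtractor` (Spring Batch 5+)	Java record components
`PassThroughFieldExtractor`	passes the item through (for example, an item that is already an Object[])
Custom	complex conversions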

@Bean
public FieldExtractor<Customer> customerExtractor() {
    BeanWrapperFieldExtractor<Customer> extractor = new BeanWrapperFieldExtractor<>();
    extractor.setNames(new String[]{"id", "name", "email"});
    return extractor;
}

The mapping flow on the reader and writer sides

[Reader side]
String line = "1,Alice,..."
  ↓ LineTokenizer
FieldSet
  ↓ FieldSetMapper
Customer

[Writer side, reversed]
Customer
  ↓ FieldExtractor
Object[]
  ↓ LineAggregator
String line = "1,Alice,..."

The three reader pieces (LineTokenizer, FieldSet, FieldSetMapper) and the three writer pieces (FieldExtractor, Object[], LineAggregator) mirror each other. LineAggregator is the interface that joins a field array into a single line of text.
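
The symmetry is easier to see in a plain-Java analogy. The sketch below uses no Spring Batch types; each hypothetical method (tokenize, map, extract, aggregate) only stands in for the role of the component named in its comment, and the Customer class is a stand-in domain object.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Plain-Java analogy of the two pipelines; none of these are Spring Batch types.
final class RoundTripSketch {

    // Stand-in domain object (hypothetical).
    static final class Customer {
        final long id;
        final String name;
        final String email;

        Customer(long id, String name, String email) {
            this.id = id;
            this.name = name;
            this.email = email;
        }
    }

    // Reader side, step 1: line -> tokens (the role a LineTokenizer plays).
    static String[] tokenize(String line) {
        return line.split(",", -1); // -1 keeps trailing empty fields
    }

    // Reader side, step 2: tokens -> domain object (the role a FieldSetMapper plays).
    static Customer map(String[] tokens) {
        return new Customer(Long.parseLong(tokens[0].trim()), tokens[1].trim(), tokens[2].trim());
    }

    // Writer side, step 1: domain object -> Object[] (the role a FieldExtractor plays).
    static Object[] extract(Customer customer) {
        return new Object[] {customer.id, customer.name, customer.email};
    }

    // Writer side, step 2: Object[] -> line (the role a LineAggregator plays).
    static String aggregate(Object[] fields) {
        return Arrays.stream(fields).map(String::valueOf).collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        String line = "1,Alice,alice@example.com";
        System.out.println(aggregate(extract(map(tokenize(line)))));
    }
}
```

Reading a line and writing it back yields the same text, which is the round trip the two diagrams describe.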

Common incidents

Incident 1: ArrayIndexOutOfBoundsException

Cause: accessing an index beyond the number of tokens. (A name that does not exist fails with IllegalArgumentException instead.)

Fix: check getFieldCount() first, or wrap the access in try-catch.

Incident 2: Default for empty cells

Cause: readInt("count") throws NumberFormatException when it meets an empty cell.

Fix: specify a default, as in readInt("count", 0).

Incident 3: Date format mismatch

Cause: the default pattern does not match the date format in the file.

Fix: state the pattern explicitly, as in readDate("date", "yyyy-MM-dd HH:mm:ss").

Incident 4: Thousands separators in BigDecimal

Cause: the commas in "1,234,567.89" get cut apart at the split stage.

Fix: set quoteCharacter when reading, or handle it at the parsing stage with NumberFormat or a custom mapper.
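
For the parsing-stage option, here is a minimal JDK-only sketch that turns a grouped amount into a BigDecimal with DecimalFormat. It assumes US-style separators ("," for grouping, "." for decimals), and GroupedAmountParser is a hypothetical helper you would call from a custom mapper on the raw string.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical helper for amounts such as "1,234,567.89" (US-style separators assumed).
final class GroupedAmountParser {

    static BigDecimal parse(String raw) {
        String text = raw.trim();
        // A fresh format per call, because DecimalFormat is not thread-safe.
        DecimalFormat format = (DecimalFormat) NumberFormat.getInstance(Locale.US);
        format.setParseBigDecimal(true); // ask for BigDecimal results instead of Double/Long
        ParsePosition position = new ParsePosition(0);
        Number parsed = format.parse(text, position);
        // Reject failures and partial matches such as "12abc", which parse() would accept.
        if (!(parsed instanceof BigDecimal) || position.getIndex() != text.length()) {
            throw new NumberFormatException("Unparseable amount: " + raw);
        }
        return (BigDecimal) parsed;
    }

    public static void main(String[] args) {
        System.out.println(parse("1,234,567.89"));
    }
}
```

The ParsePosition check matters because NumberFormat stops at the first character it cannot read and would otherwise return a truncated number without complaint.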

Incident 5: Access with the wrong name

Cause: a typo such as fs.readString("nmae").

Fix: turn the names into constants, for example public static final String NAME_FIELD = "name".

Incident 6: Booleans that are not "true"/"false"

Cause: data that arrives as Y/N or 1/0.

Fix: pass a custom true value, as in readBoolean("active", "Y").
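
One caveat with a custom true value is that every other token becomes false, so a typo such as "X" goes unnoticed. If that matters, a stricter mapping can be applied to the raw string in a custom mapper. The sketch below is one way to do it in plain Java; StrictFlagParser and its accepted values are assumptions for illustration.

```java
import java.util.Locale;

// Hypothetical helper: accepts a fixed set of flag spellings and rejects everything else.
final class StrictFlagParser {

    static boolean parse(String token) {
        if (token == null) {
            throw new IllegalArgumentException("Flag is missing");
        }
        switch (token.trim().toUpperCase(Locale.ROOT)) {
            case "Y":
            case "1":
            case "TRUE":
                return true;
            case "N":
            case "0":
            case "FALSE":
                return false;
            default:
                // Unknown spellings fail loudly instead of silently becoming false.
                throw new IllegalArgumentException("Unknown flag: " + token);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("Y"));
        System.out.println(parse("0"));
    }
}
```

Because it throws IllegalArgumentException, a bad flag can be treated like any other format error.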

Incident 7: SimpleDateFormat thread safety

Cause: readDate on a shared FieldSet is called from multiple threads.

Fix: keep the normal Spring Batch flow, in which each thread gets its own FieldSet, and do not share one yourself.

Recommended patterns in production

Pattern 1: Safe default values

public Customer mapFieldSet(FieldSet fs) {
    return new Customer(
        fs.readLong("id"),
        fs.readString("name"),
        fs.readString("email"),
        fs.readBigDecimal("balance", BigDecimal.ZERO),
        fs.readBoolean("active", "Y"),
        fs.readInt("loginCount", 0)
    );
}

Stating the default for empty cells in code lets you absorb incomplete data naturally.

Pattern 2: Build in a LocalDate conversion step

public Order mapFieldSet(FieldSet fs) {
    return new Order(
        fs.readLong("id"),
        fs.readBigDecimal("amount"),
        parseLocalDate(fs.readString("orderDate"))
    );
}

private LocalDate parseLocalDate(String raw) {
    if (raw == null || raw.isBlank()) return null;
    return LocalDate.parse(raw, DateTimeFormatter.ISO_LOCAL_DATE);
}

This structure bypasses java.util.Date and goes straight to the modern date-time API.
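
When the file uses a compact format such as 20240229 instead of ISO, the same idea works with an explicit pattern. The sketch below is a JDK-only variant; CompactDateParser is a hypothetical name, and the uuuuMMdd pattern with strict resolution is an assumption about the input format.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.ResolverStyle;

// Hypothetical helper for compact dates such as "20240229".
final class CompactDateParser {

    // STRICT rejects impossible dates instead of adjusting them to the last valid day.
    private static final DateTimeFormatter COMPACT =
            DateTimeFormatter.ofPattern("uuuuMMdd").withResolverStyle(ResolverStyle.STRICT);

    // Returns null for missing values; throws DateTimeParseException for malformed ones.
    static LocalDate parse(String raw) {
        if (raw == null || raw.isBlank()) {
            return null;
        }
        return LocalDate.parse(raw.trim(), COMPACT);
    }

    public static void main(String[] args) {
        System.out.println(parse("20240229"));
    }
}
```

In a mapper this would be called as parse(fs.readString("orderDate")), in the same position as parseLocalDate in Pattern 2.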

Pattern 3: Integrated validation

public Customer mapFieldSet(FieldSet fs) {
    long id = fs.readLong("id");
    if (id <= 0) throw new IllegalStateException("Invalid id: " + id);

    String email = fs.readString("email");
    // Guard against a null token before calling contains()
    if (email == null || !email.contains("@")) throw new IllegalStateException("Invalid email: " + email);

    return new Customer(id, fs.readString("name"), email);
}

An exception thrown inside the mapper is picked up by the skip logic from part 14, provided the exception is configured as skippable.

Pattern 4: Unit tests

@Test
void readsAllFields() {
    FieldSet fs = new DefaultFieldSet(
        new String[]{"1", "Alice", "alice@example.com"},
        new String[]{"id", "name", "email"}
    );

    Customer customer = mapper.mapFieldSet(fs);

    assertEquals(1L, customer.getId());
    assertEquals("Alice", customer.getName());
}

Creating a DefaultFieldSet directly makes the mapper easy to unit test.

Java records and FieldSet (Spring Batch 5+)

public record Customer(Long id, String name, String email) {}

@Bean
public FieldSetMapper<Customer> recordMapper() {
    return new RecordFieldSetMapper<>(Customer.class);
}

RecordFieldSetMapper (a mapper driven by record component names) maps a FieldSet by record component name. It lets you build immutable domain objects safely, without setters.

.targetType(Customer.class)        // records work too

The .targetType() method of FlatFileItemReaderBuilder (the builder class that assembles a FlatFileItemReader) also recognizes records automatically.

One last look before the exam: condensed notes on FieldSet pitfalls

FieldSet = a JDBC ResultSet-like abstraction for flat files
String[] tokens + optional String[] names
Index access: fs.readString(0), fs.readInt(1)
Name access: fs.readString("name") (recommended)
read method family: readString, readBoolean, readChar, readByte, readShort, readInt, readLong, readFloat, readDouble, readBigDecimal, readDate
Default-value argument = readInt("count", 0), safe handling of empty cells
Custom boolean true value: readBoolean("active", "Y")
Date pattern: readDate("date", "yyyy-MM-dd")
Format error = IllegalArgumentException → absorbed by the part 14 skip logic
null vs. empty = readString trims the token (null stays null); readRawString returns the untrimmed value
Metadata methods = getFieldCount, getValues, getNames, getProperties
Writer-side mirror = FieldExtractor (BeanWrapper, Record, PassThrough, custom)
Mapping flow: Reader: line → Tokenizer → FieldSet → FieldSetMapper → domain object / Writer: domain object → FieldExtractor → Object[] → LineAggregator → line
Direct DefaultFieldSet use = tests or a custom reader
Pitfall: ArrayIndexOutOfBoundsException (too few tokens) → check getFieldCount
Pitfall: NumberFormatException on an empty cell → specify a default
Pitfall: date format mismatch → state the pattern explicitly
Pitfall: thousands separators in BigDecimal → quoteCharacter or custom parsing
Pitfall: name typos → turn names into constants
Pitfall: SimpleDateFormat thread safety → keep the normal Spring Batch flow
Pitfall: nonstandard booleans such as Y/N or 1/0 → custom true value
Recommended: a LocalDate conversion step, explicit defaults, validation that throws and lets skip absorb it
Spring Batch 5+: map Java records with RecordFieldSetMapper
.targetType(Customer.class) also recognizes records automatically

You can read the original text in the official documentation: The FieldSet.
