Skip to content

Conversation

@VGalaxies
Copy link
Contributor

add inclusion/exclusion pattern keys for pipe source filtering


This PR was primarily authored with Codex using gpt-5.2-codex xhigh and then hand-reviewed by me. I AM responsible for every change made in this PR. I aimed to keep it aligned with our goals, though I may have missed minor issues. Please flag anything that feels off, I'll fix it quickly.

BREAKING CHANGE: legacy source.pattern/source.path accept only one rule; use source.pattern.inclusion for multi-value wildcard patterns.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces new pattern inclusion/exclusion keys (source.pattern.inclusion, source.pattern.exclusion, extractor.pattern.inclusion, extractor.pattern.exclusion) for pipe source filtering, enabling support for multiple patterns in inclusion and exclusion filters. The legacy single-pattern restriction has been removed when using the new keys, while maintaining backward compatibility for legacy keys (source.pattern, source.path) which now reject multi-pattern usage.

Changes:

  • Added new pattern inclusion/exclusion constants to PipeSourceConstant
  • Refactored pattern parsing logic in TreePattern to support both new multi-pattern keys and legacy single-pattern keys
  • Updated tests to use new keys and added validation for legacy multi-pattern rejection
  • Updated integration tests to demonstrate multi-pattern and exclusion functionality

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
PipeSourceConstant.java Added constants for new pattern inclusion keys
TreePattern.java Refactored pattern parsing to support new inclusion/exclusion keys with multi-pattern support and IoTDB pattern format
IoTDBDataRegionSource.java Updated validation to include new pattern inclusion keys
PipeDataNodeTaskAgent.java Updated memory calculation checks to include new pattern keys
IoTDBDataRegionSourceTest.java Removed tests for legacy multi-pattern usage (now rejected) and cleaned up @ignore annotation
TreePatternPruningTest.java Migrated tests from legacy keys to new inclusion keys and added tests for legacy multi-pattern rejection
IoTDBPipeInclusionIT.java Updated integration tests to use new pattern inclusion/exclusion keys
IoTDBTreePatternFormatIT.java Extensively updated integration tests demonstrating multi-pattern and exclusion functionality with new keys

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 306 to 305
public void testMultiplePrefixPatternRealtimeData() throws Exception {
final Map<String, String> sourceAttributes = new HashMap<>();
sourceAttributes.put("source.pattern", "root.db.d1.s, root.db2.d1.s");
sourceAttributes.put("source.pattern.inclusion", "root.db.d1.s*, root.db2.d1.s");
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to testMultiplePrefixPatternHistoricalData, this test name is misleading as it uses IoTDB pattern syntax with wildcards rather than prefix patterns.

Copilot uses AI. Check for mistakes.
Comment on lines 479 to 470
public void testPrefixPatternWithExclusionRealtimeData() throws Exception {
final Map<String, String> sourceAttributes = new HashMap<>();
sourceAttributes.put("source.pattern", "root.db.d1, root.db.d2");
sourceAttributes.put("source.pattern.inclusion", "root.db.d1.**, root.db.d2.**");
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the historical data test, this test name is misleading as it uses IoTDB pattern syntax rather than prefix patterns.

Copilot uses AI. Check for mistakes.
Comment on lines +164 to +171
if (hasPatternInclusionKey && (hasLegacyPathKey || hasLegacyPatternKey)) {
final String msg =
String.format(
"Pipe: %s cannot be used together with %s or %s.",
SOURCE_PATTERN_INCLUSION_KEY, SOURCE_PATTERN_KEY, SOURCE_PATH_KEY);
LOGGER.warn(msg);
throw new PipeException(msg);
}
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no test coverage for the mutual exclusivity validation that prevents using source.pattern.inclusion together with source.pattern or source.path (lines 164-171). Consider adding a test that verifies a PipeException is thrown when these incompatible parameters are used together.

Copilot uses AI. Check for mistakes.
Comment on lines +167 to +168
"Pipe: %s cannot be used together with %s or %s.",
SOURCE_PATTERN_INCLUSION_KEY, SOURCE_PATTERN_KEY, SOURCE_PATH_KEY);
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message only mentions SOURCE_PATTERN_INCLUSION_KEY, SOURCE_PATTERN_KEY, and SOURCE_PATH_KEY but should also mention their EXTRACTOR_* equivalents for consistency. The condition checks for EXTRACTOR_PATTERN_INCLUSION_KEY, EXTRACTOR_PATH_KEY, and EXTRACTOR_PATTERN_KEY but these are not included in the error message. Consider updating the message to include all checked keys or use more generic descriptions.

Suggested change
"Pipe: %s cannot be used together with %s or %s.",
SOURCE_PATTERN_INCLUSION_KEY, SOURCE_PATTERN_KEY, SOURCE_PATH_KEY);
"Pipe: %s or %s cannot be used together with %s, %s, %s or %s.",
EXTRACTOR_PATTERN_INCLUSION_KEY,
SOURCE_PATTERN_INCLUSION_KEY,
EXTRACTOR_PATTERN_KEY,
SOURCE_PATTERN_KEY,
EXTRACTOR_PATH_KEY,
SOURCE_PATH_KEY);

Copilot uses AI. Check for mistakes.
Comment on lines +176 to +181
? parseIoTDBPatternList(
sourceParameters.getStringByKeys(
EXTRACTOR_PATTERN_INCLUSION_KEY, SOURCE_PATTERN_INCLUSION_KEY),
isTreeModelDataAllowedToBeCaptured,
true,
SOURCE_PATTERN_INCLUSION_KEY)
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new pattern inclusion keys always parse patterns as IoTDBTreePattern, completely ignoring the source.pattern.format parameter. This is inconsistent with the legacy behavior where source.pattern respects the format parameter (lines 634-678 in parsePatternsFromPatternParameter). Users who want to use prefix format patterns with the new inclusion keys will not be able to do so. Consider either: 1) Supporting source.pattern.format for the new keys, or 2) Documenting this as a breaking change and intentional design decision.

Copilot uses AI. Check for mistakes.

@Test
@Ignore("Disabled: multi/exclusion tree patterns are blocked in this branch")
public void testMultiplePrefixPatternHistoricalData() throws Exception {
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test name "testMultiplePrefixPatternHistoricalData" is misleading because it uses IoTDB pattern syntax with wildcards ("root.db.d1.s*") rather than prefix patterns. The new source.pattern.inclusion key only supports IoTDB pattern format and ignores source.pattern.format. Either rename this test to reflect that it uses IoTDB patterns, or if prefix patterns were intended, this test will not work as expected.

Suggested change
public void testMultiplePrefixPatternHistoricalData() throws Exception {
public void testMultipleIoTDBPatternHistoricalData() throws Exception {

Copilot uses AI. Check for mistakes.
// Inclusion: Match everything under root.db.d1 and root.db.d2
sourceAttributes.put("source.pattern", "root.db.d1, root.db.d2");
sourceAttributes.put("source.pattern.inclusion", "root.db.d1.**, root.db.d2.**");
// Exclusion: Exclude anything with the prefix root.db.d1.s1
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says "Exclude anything with the prefix root.db.d1.s1" but the pattern uses IoTDB syntax, not prefix syntax. Since the new source.pattern.inclusion and source.pattern.exclusion keys only support IoTDB patterns, the exclusion "root.db.d1.s1" will match exactly "root.db.d1.s1" or "root.db.d1.s1.**" (depending on how IoTDB pattern matching works), not "anything with the prefix root.db.d1.s1". This comment should be corrected to reflect IoTDB pattern matching semantics.

Suggested change
// Exclusion: Exclude anything with the prefix root.db.d1.s1
// Exclusion: Exclude the timeseries root.db.d1.s1

Copilot uses AI. Check for mistakes.

@Test
@Ignore("Disabled: multi/exclusion tree patterns are blocked in this branch")
public void testPrefixPatternWithExclusionHistoricalData() throws Exception {
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test name "testPrefixPatternWithExclusionHistoricalData" is misleading because it uses IoTDB pattern syntax with wildcards ("root.db.d1.", "root.db.d2.") rather than prefix patterns. Consider renaming to "testIoTDBPatternWithExclusionHistoricalData" or similar to accurately reflect the pattern type being tested.

Suggested change
public void testPrefixPatternWithExclusionHistoricalData() throws Exception {
public void testIoTDBPatternWithExclusionHistoricalData() throws Exception {

Copilot uses AI. Check for mistakes.
Comment on lines +201 to +210
if (hasPatternInclusionKey
&& sourceParameters.hasAnyAttributes(
EXTRACTOR_PATH_EXCLUSION_KEY, SOURCE_PATH_EXCLUSION_KEY)) {
final String msg =
String.format(
"Pipe: %s cannot be used together with %s.",
SOURCE_PATTERN_INCLUSION_KEY, SOURCE_PATH_EXCLUSION_KEY);
LOGGER.warn(msg);
throw new PipeException(msg);
}
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no test coverage for the mutual exclusivity validation that prevents using source.pattern.inclusion together with source.path.exclusion (lines 201-210). Consider adding a test that verifies a PipeException is thrown when these incompatible parameters are used together.

Copilot uses AI. Check for mistakes.
Comment on lines +206 to +207
"Pipe: %s cannot be used together with %s.",
SOURCE_PATTERN_INCLUSION_KEY, SOURCE_PATH_EXCLUSION_KEY);
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message only mentions SOURCE_PATH_EXCLUSION_KEY but should also mention EXTRACTOR_PATH_EXCLUSION_KEY for consistency with other error messages. Both keys are checked in the condition but only one is shown in the error message. Consider updating the message to include both keys or use a more generic description.

Suggested change
"Pipe: %s cannot be used together with %s.",
SOURCE_PATTERN_INCLUSION_KEY, SOURCE_PATH_EXCLUSION_KEY);
"Pipe: %s cannot be used together with %s or %s.",
SOURCE_PATTERN_INCLUSION_KEY, EXTRACTOR_PATH_EXCLUSION_KEY, SOURCE_PATH_EXCLUSION_KEY);

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant